(54) Title: MASSIVE PARALLEL METHOD FOR DECODING DNA AND RNA
(57) Abstract: This invention provides methods for attaching a nucleic acid to a solid surface and for sequencing nucleic acid by
detecting the identity of each nucleotide analogue after the nucleotide analogue is incorporated into a growing strand of DNA in
a polymerase reaction. The invention also provides nucleotide analogues which comprise unique labels attached to the nucleotide
analogue through a cleavable linker, and a cleavable chemical group to cap the -OH group at the 3'-position of the deoxyribose.

MASSIVE PARALLEL METHOD FOR DECODING DNA AND RNA

This application claims the benefit of U.S. Provisional
Application No. 60/300,894, filed June 26, 2001, and is
a continuation-in-part of U.S. Serial No. 09/684,670,
filed October 6, 2000, the contents of both of which are
hereby incorporated by reference in their entireties
into this application.

Background Of The Invention 

Throughout this application, various publications are 
referenced in parentheses by author and year. Full 
citations for these references may be found at the end 
of the specification immediately preceding the claims. 
The disclosures of these publications in their 
entireties are hereby incorporated by reference into 
this application to more fully describe the state of the 
art to which this invention pertains. 

The ability to sequence deoxyribonucleic acid (DNA)
accurately and rapidly is revolutionizing biology and
medicine. The massive Human Genome Project is driving
an exponential growth in the development of high
throughput genetic analysis technologies. This rapid
technological development involving chemistry,
engineering, biology, and computer science makes it
possible to move from studying single genes at a time
to analyzing and comparing entire genomes.



With the completion of the first entire human genome
sequence map, many areas in the genome that are highly
polymorphic in both exons and introns will be known.
The pharmacogenomics challenge is to comprehensively
identify the genes and functional polymorphisms
associated with the variability in drug response (Roses,
2000). Resequencing of polymorphic areas in the genome
that are linked to disease development will contribute
greatly to the understanding of diseases, such as
cancer, and therapeutic development. Thus,
high-throughput accurate methods for resequencing the
highly variable intron/exon regions of the genome are
needed in order to explore the full potential of the
complete human genome sequence map. The current
state-of-the-art technology for high throughput DNA
sequencing, such as used for the Human Genome Project
(Pennisi 2000), is capillary array DNA sequencers using
laser induced fluorescence detection (Smith et al. 1986;
Ju et al. 1995, 1996; Kheterpal et al. 1996;
Salas-Solano et al. 1998). Improvements in the
polymerase that lead to uniform termination efficiency
and the introduction of thermostable polymerases have
also significantly improved the quality of sequencing
data (Tabor and Richardson, 1987, 1995). Although
capillary array DNA sequencing technology to some extent
addresses the throughput and read length requirements of
large scale DNA sequencing projects, the throughput and
accuracy required for mutation studies need to be
improved for a wide variety of applications ranging from
disease gene discovery to forensic identification. For
example, electrophoresis based DNA sequencing methods
have difficulty detecting heterozygotes unambiguously
and are not 100% accurate in regions rich in nucleotides
comprising guanine or cytosine due to compressions
(Bowling et al. 1991; Yamakawa et al. 1997). In
addition, the first few bases after the priming site are
often masked by the high fluorescence signal from excess
dye-labeled primers or dye-labeled terminators, and are
therefore difficult to identify. Therefore, the
requirement of electrophoresis for DNA sequencing is
still the bottleneck for high-throughput DNA sequencing
and mutation detection projects.



The concept of sequencing DNA by synthesis without using
electrophoresis was first revealed in 1988 (Hyman, 1988)
and involves detecting the identity of each nucleotide
as it is incorporated into the growing strand of DNA in
a polymerase reaction. Such a scheme coupled with the
chip format and laser-induced fluorescent detection has
the potential to markedly increase the throughput of DNA
sequencing projects. Consequently, several groups have
investigated such a system with an aim to construct an
ultra high-throughput DNA sequencing procedure
(Cheeseman 1994, Metzker et al. 1994). Thus far, no
complete success of using such a system to unambiguously
sequence DNA has been reported. The pyrosequencing
approach that employs four natural nucleotides
(comprising a base of adenine (A), cytosine (C), guanine
(G), or thymine (T)) and several other enzymes for
sequencing DNA by synthesis is now widely used for
mutation detection (Ronaghi 1998). In this approach,
the detection is based on the pyrophosphate (PPi)
released during the DNA polymerase reaction, the
quantitative conversion of pyrophosphate to adenosine
triphosphate (ATP) by sulfurylase, and the subsequent
production of visible light by firefly luciferase. This
procedure can only sequence up to 30 base pairs (bps) of
nucleotide sequences, and each of the 4 nucleotides
needs to be added separately and detected separately.
Long stretches of the same bases cannot be identified
unambiguously with the pyrosequencing method.

More recent work in the literature exploring DNA
sequencing by a synthesis method is mostly focused on
designing and synthesizing a photocleavable chemical
moiety that is linked to a fluorescent dye to cap the
3'-OH group of deoxynucleoside triphosphates (dNTPs)
(Welch et al. 1999). Limited success for the
incorporation of the 3'-modified nucleotide by DNA
polymerase is reported. The reason is that the
3'-position on the deoxyribose is very close to the
amino acid residues in the active site of the
polymerase, and the polymerase is therefore sensitive to
modification in this area of the deoxyribose ring. On
the other hand, it is known that modified DNA
polymerases (Thermo Sequenase and Taq FS polymerase) are
able to recognize nucleotides with extensive
modifications with bulky groups such as energy transfer
dyes at the 5-position of the pyrimidines (T and C) and
at the 7-position of purines (G and A) (Rosenblum et al.
1997, Zhu et al. 1994). The structure of the ternary
complexes of rat DNA polymerase, a DNA template-primer,
and dideoxycytidine triphosphate (ddCTP) has been
determined (Pelletier et al. 1994), which supports this
fact. As shown in Figure 1, the 3-D structure indicates
that the surrounding area of the 3'-position of the
deoxyribose ring in ddCTP is very crowded, while there
is ample space for modification on the 5-position of the
cytidine base.

The approach disclosed in the present application is to
make nucleotide analogues by linking a unique label such
as a fluorescent dye or a mass tag through a cleavable
linker to the nucleotide base or an analogue of the
nucleotide base, such as to the 5-position of the
pyrimidines (T and C) and to the 7-position of the
purines (G and A), to use a small cleavable chemical
moiety to cap the 3'-OH group of the deoxyribose to make
it nonreactive, and to incorporate the nucleotide
analogues into the growing DNA strand as terminators.
Detection of the unique label will yield the sequence
identity of the nucleotide. Upon removing the label and
the 3'-OH capping group, the polymerase reaction will
proceed to incorporate the next nucleotide analogue and
detect the next base.
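
As an illustration only, the incorporate, detect, and cleave cycle described above can be sketched in Python. The label names are placeholders, and the function name, the dictionaries, and the assumption that incorporation and cleavage are complete in every cycle are hypothetical simplifications rather than part of the disclosed chemistry.

```python
# Hypothetical sketch: one labeled, 3'-capped analogue is incorporated per cycle.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Placeholder label assigned to each base (assumption for illustration).
LABEL_OF_BASE = {"A": "LABEL1", "C": "LABEL2", "G": "LABEL3", "T": "LABEL4"}
BASE_OF_LABEL = {label: base for base, label in LABEL_OF_BASE.items()}


def sequence_by_synthesis(template_3_to_5):
    """Return the bases called for the growing strand, one per cycle."""
    called = []
    for template_base in template_3_to_5:
        # Incorporation: the analogue complementary to the template base is
        # added; its capped 3'-OH stops further extension in this cycle.
        incorporated = COMPLEMENT[template_base]
        # Detection: the unique label identifies the incorporated base.
        label = LABEL_OF_BASE[incorporated]
        called.append(BASE_OF_LABEL[label])
        # Cleavage of the label and of the 3'-OH cap is assumed complete,
        # so the next iteration can extend the strand by one more base.
    return "".join(called)


if __name__ == "__main__":
    print(sequence_by_synthesis("TACGGA"))
```

Because each analogue acts as a terminator until it is uncapped, the sketch advances exactly one base per loop iteration.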


It is also desirable to use a photocleavable group to
cap the 3'-OH group. However, a photocleavable group is
generally bulky and thus the DNA polymerase will have
difficulty incorporating the nucleotide analogues
containing a photocleavable moiety capping the 3'-OH
group. If small chemical moieties that can be easily
cleaved chemically with high yield can be used to cap
the 3'-OH group, such nucleotide analogues should also
be recognized as substrates for DNA polymerase. It has
been reported that 3'-O-methoxy-deoxynucleotides are
good substrates for several polymerases (Axelrod et al.
1978). 3'-O-allyl-dATP was also shown to be
incorporated by VentR(exo-) DNA polymerase in the
growing strand of DNA (Metzker et al. 1994). However,
the procedure to chemically cleave the methoxy group is
stringent and requires anhydrous conditions. Thus, it
is not practical to use a methoxy group to cap the 3'-OH
group for sequencing DNA by synthesis. An ester group
was also explored to cap the 3'-OH group of the
nucleotide, but it was shown to be cleaved by the
nucleophiles in the active site in DNA polymerase
(Canard et al. 1995). Chemical groups with
electrophiles such as ketone groups are not suitable for
protecting the 3'-OH of the nucleotide in enzymatic
reactions due to the existence of strong nucleophiles in
the polymerase. It is known that MOM (-CH2OCH3) and
allyl (-CH2CH=CH2) groups can be used to cap an -OH
group, and can be cleaved chemically with high yield
(Ireland et al. 1986; Kamal et al. 1999). The approach
disclosed in the present application is to incorporate
nucleotide analogues, which are labeled with cleavable,
unique labels such as fluorescent dyes or mass tags and
where the 3'-OH is capped with a cleavable chemical
moiety such as either a MOM group (-CH2OCH3) or an allyl
group (-CH2CH=CH2), into the growing strand of DNA as
terminators. The optimized nucleotide set
(3'-RO-A-LABEL1, 3'-RO-C-LABEL2, 3'-RO-G-LABEL3,
3'-RO-T-LABEL4, where R denotes the chemical group used
to cap the 3'-OH) can then be used for DNA sequencing by
the synthesis approach.

There are many advantages of using mass spectrometry
(MS) to detect small and stable molecules. For example,
the mass resolution can be as good as one dalton. Thus,
compared to gel electrophoresis sequencing systems and
the laser induced fluorescence detection approach which
have overlapping fluorescence emission spectra, leading
to heterozygote detection difficulty, the MS approach
disclosed in this application produces very high
resolution of sequencing data by detecting the cleaved
small mass tags instead of the long DNA fragment. This
method also produces extremely fast separation in the
time scale of microseconds. The high resolution allows
accurate digital mutation and heterozygote detection.
Another advantage of sequencing with mass spectrometry
by detecting the small mass tags is that the
compressions associated with gel based systems are
completely eliminated.



In order to maintain a continuous hybridized primer
extension product with the template DNA, a primer that
contains a stable loop to form an entity capable of
self-priming in a polymerase reaction can be ligated to
the 3' end of each single stranded DNA template that is
immobilized on a solid surface such as a chip. This
approach will solve the problem of washing off the
growing extension products in each cycle.

Saxon and Bertozzi (2000) developed an elegant and
highly specific coupling chemistry linking a specific
group that contains a phosphine moiety to an azido group
on the surface of a biological cell. In the present
application, this coupling chemistry is adopted to
create a solid surface which is coated with a covalently
linked phosphine moiety, and to generate polymerase
chain reaction (PCR) products that contain an azido
group at the 5' end for specific coupling of the DNA
template with the solid surface. One example of a solid
surface is glass channels which have an inner wall with
an uneven or porous surface to increase the surface
area. Another example is a chip.



The present application discloses a novel and
advantageous system for DNA sequencing by the synthesis
approach which employs a stable DNA template, which is
able to self-prime for the polymerase reaction,
covalently linked to a solid surface such as a chip, and
four unique nucleotide analogues (3'-RO-A-LABEL1,
3'-RO-C-LABEL2, 3'-RO-G-LABEL3, 3'-RO-T-LABEL4). The
success of this novel system will allow the development
of an ultra high-throughput and high fidelity DNA
sequencing system for polymorphism and pharmacogenetics
applications and for whole genome sequencing. This fast
and accurate DNA resequencing system is needed in such
fields as detection of single nucleotide polymorphisms
(SNPs) (Chee et al. 1996), serial analysis of gene
expression (SAGE) (Velculescu et al. 1995),
identification in forensics, and genetic disease
association studies.



Summary Of The Invention

This invention is directed to a method for sequencing a
nucleic acid by detecting the identity of a nucleotide
analogue after the nucleotide analogue is incorporated
into a growing strand of DNA in a polymerase reaction,
which comprises the following steps:

(i) attaching a 5' end of the nucleic acid to a
solid surface;

(ii) attaching a primer to the nucleic acid
attached to the solid surface;

(iii) adding a polymerase and one or more different
nucleotide analogues to the nucleic acid to
thereby incorporate a nucleotide analogue into
the growing strand of DNA, wherein the
incorporated nucleotide analogue terminates
the polymerase reaction and wherein each
different nucleotide analogue comprises (a) a
base selected from the group consisting of
adenine, guanine, cytosine, thymine, and
uracil, and their analogues; (b) a unique
label attached through a cleavable linker to
the base or to an analogue of the base; (c) a
deoxyribose; and (d) a cleavable chemical
group to cap an -OH group at a 3'-position of
the deoxyribose;

(iv) washing the solid surface to remove
unincorporated nucleotide analogues;

(v) detecting the unique label attached to the
nucleotide analogue that has been incorporated
into the growing strand of DNA, so as to
thereby identify the incorporated nucleotide
analogue;

(vi) adding one or more chemical compounds to
permanently cap any unreacted -OH group on the
primer attached to the nucleic acid or on a
primer extension strand formed by adding one
or more nucleotides or nucleotide analogues to
the primer;

(vii) cleaving the cleavable linker between the
nucleotide analogue that was incorporated into
the growing strand of DNA and the unique
label;

(viii) cleaving the cleavable chemical group
capping the -OH group at the 3'-position of
the deoxyribose to uncap the -OH group, and
washing the solid surface to remove cleaved
compounds; and

(ix) repeating steps (iii) through (viii) so as to
detect the identity of a newly incorporated
nucleotide analogue into the growing strand of
DNA;

wherein if the unique label is a dye, the order of
steps (v) through (vii) is: (v), (vi), and (vii);
and

wherein if the unique label is a mass tag, the
order of steps (v) through (vii) is: (vi), (vii),
and (v).
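
The dependence of the step order on the label type can be sketched as follows. This is a minimal illustration, assuming hypothetical names (`cycle_step_order` and the strings "dye" and "mass tag"); it encodes only the ordering stated above and does not model any chemistry.

```python
def cycle_step_order(label_type):
    """Return the order of steps (iii) through (viii) within one cycle."""
    if label_type == "dye":
        # A dye is detected while still attached to the incorporated analogue.
        middle = ["v", "vi", "vii"]
    elif label_type == "mass tag":
        # A mass tag is detected after the cleavable linker has been cleaved.
        middle = ["vi", "vii", "v"]
    else:
        raise ValueError("label_type must be 'dye' or 'mass tag'")
    return ["iii", "iv"] + middle + ["viii"]


if __name__ == "__main__":
    for kind in ("dye", "mass tag"):
        print(kind, "->", ", ".join(cycle_step_order(kind)))
```

In both cases steps (iii), (iv), and (viii) keep their positions; only the relative order of detection, permanent capping, and linker cleavage changes.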



The invention provides a method of attaching a nucleic
acid to a solid surface which comprises:

(i) coating the solid surface with a phosphine
moiety,

(ii) attaching an azido group to a 5' end of the
nucleic acid, and

(iii) immobilizing the 5' end of the nucleic acid
to the solid surface through interaction
between the phosphine moiety on the solid
surface and the azido group on the 5' end of
the nucleic acid.


The invention provides a nucleotide analogue which
comprises:

(a) a base selected from the group consisting of
adenine or an analogue of adenine, cytosine or
an analogue of cytosine, guanine or an
analogue of guanine, thymine or an analogue of
thymine, and uracil or an analogue of uracil;

(b) a unique label attached through a cleavable
linker to the base or to an analogue of the
base;

(c) a deoxyribose; and

(d) a cleavable chemical group to cap an -OH group
at a 3'-position of the deoxyribose.

The invention provides a parallel mass spectrometry
system, which comprises a plurality of atmospheric
pressure chemical ionization mass spectrometers for
parallel analysis of a plurality of samples comprising
mass tags.

Brief Description Of The Figures

Figure 1: The 3D structure of the ternary complexes of 
rat DNA polymerase, a DNA template-primer, and 
dideoxycytidine triphosphate (ddCTP) . The left side of 
the illustration shows the mechanism for the addition of 
ddCTP and the right side of the illustration shows the 
active site of the polymerase. Note that the 3' 
position of the dideoxyribose ring is very crowded, 
while ample space is available at the 5 position of the 
cytidine base. 



Figure 2A-2B: Scheme of sequencing by the synthesis
approach. A: Example where the unique labels are dyes
and the solid surface is a chip. B: Example where the
unique labels are mass tags and the solid surface is
channels etched into a glass chip. A, C, G, T:
nucleotide triphosphates comprising bases adenine,
cytosine, guanine, and thymine; d, deoxy; dd, dideoxy;
R, cleavable chemical group used to cap the -OH group;
Y, cleavable linker.



Figure 3: The synthetic scheme for the immobilization of
an azido (N3) labeled DNA fragment to a solid surface
coated with a triarylphosphine moiety. Me, methyl group;
P, phosphorus; Ph, phenyl.



Figure 4: The synthesis of triarylphosphine N- 
hydroxysuccinimide (NHS) ester. 

Figure 5: The synthetic scheme for attaching an azido
(N3) group through a linker to the 5' end of a DNA
fragment, which is then used to couple with the
triarylphosphine moiety on a solid surface. DMSO,
dimethyl sulfoxide.

Figure 6A-6B: Ligation of the looped primer (B) to the
immobilized single stranded DNA template, forming a self
primed DNA template moiety on a solid surface. P (in
circle), phosphate.

Figure 7: Examples of structures of four nucleotide
analogues for use in the sequencing by synthesis
approach. Each nucleotide analogue has a unique
fluorescent dye attached to the base through a
photocleavable linker and the 3'-OH is either exposed or
capped with a MOM group or an allyl group. FAM,
5-carboxyfluorescein; R6G, 6-carboxyrhodamine-6G; TAM,
N,N,N',N'-tetramethyl-6-carboxyrhodamine; ROX,
6-carboxy-X-rhodamine. R = H, CH2OCH3 (MOM) or CH2CH=CH2
(Allyl).

Figure 8: A representative scheme for the synthesis of
the nucleotide analogue 3'-RO-G-Tam. A similar scheme can
be used to create the other three modified nucleotides:
3'-RO-A-Dye1, 3'-RO-C-Dye2, 3'-RO-T-Dye4. (i)
tetrakis(triphenylphosphine)palladium(0); (ii) POCl3,
Bn4N+ pyrophosphate; (iii) NH4OH; (iv) Na2CO3/NaHCO3
(pH = 9.0)/DMSO.

Figure 9: A scheme for testing the sequencing by
synthesis approach. Each nucleotide, modified by the
attachment of a unique fluorescent dye, is added one by
one, based on the complementary template. The dye is
detected and cleaved to test the approach. Dye1 = Fam;
Dye2 = R6G; Dye3 = Tam; Dye4 = Rox.



Figure 10: The expected photocleavage products of DNA
containing a photo-cleavable dye (Tam). Light
absorption (300 - 360 nm) by the aromatic 2-nitrobenzyl
moiety causes reduction of the 2-nitro group to a
nitroso group and an oxygen insertion into the
carbon-hydrogen bond located in the 2-position followed
by cleavage and decarboxylation (Pillai 1980).

Figure 11: Synthesis of PC-LC-Biotin-FAM to evaluate the
photolysis efficiency of the fluorophore coupled with
the photocleavable linker 2-nitrobenzyl group.



Figure 12: Fluorescence spectra (λex = 480 nm) of
PC-LC-Biotin-FAM immobilized on a microscope glass slide
coated with streptavidin (a); after 10 min photolysis
(λirr = 350 nm; ~0.5 mW/cm2) (b); and after washing with
water to remove the photocleaved dye (c).

Figure 13A-13B: Synthetic scheme for capping the 3'-OH
of nucleotide.

Figure 14: Chemical cleavage of the MOM group (top row)
and the allyl group (bottom row) to free the 3'-OH in
the nucleotide. ClTMS = chlorotrimethylsilane.

Figure 15A-15B: Examples of energy transfer coupled dye
systems, where Fam or Cy2 is employed as a light
absorber (energy transfer donor) and Cl2Fam, Cl2R6G,
Cl2Tam, or Cl2Rox as an energy transfer acceptor. Cy2,
cyanine; FAM, 5-carboxyfluorescein; R6G,
6-carboxyrhodamine-6G; TAM,
N,N,N',N'-tetramethyl-6-carboxyrhodamine; ROX,
6-carboxy-X-rhodamine.



Figure 16: The synthesis of a photocleavable energy
transfer dye-labeled nucleotide. DMF, dimethylformamide.
DEC = 1-(3-dimethylaminopropyl)-3-ethylcarbodiimide
hydrochloride. R = H, CH2OCH3 (MOM) or CH2CH=CH2 (Allyl).


Figure 17: Structures of four mass tag precursors and
four photoactive mass tags. Precursors: a) acetophenone;
b) 3-fluoroacetophenone; c) 3,4-difluoroacetophenone;
and d) 3,4-dimethoxyacetophenone. Four photoactive mass
tags are used to code for the identity of each of the
four nucleotides (A, C, G, T).
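
To illustrate why these four precursors are readily distinguished by mass, the short Python sketch below computes their average molecular masses. The molecular formulas and rounded atomic masses are standard reference values and are not stated in this application; the calculation covers only the precursors, not the final photoactive mass tags, and no assignment of tags to nucleotides is implied.

```python
# Rounded standard atomic masses in daltons (assumed reference values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "F": 18.998}

# Molecular formulas of the four precursors named in Figure 17.
PRECURSORS = {
    "acetophenone": {"C": 8, "H": 8, "O": 1},
    "3-fluoroacetophenone": {"C": 8, "H": 7, "F": 1, "O": 1},
    "3,4-difluoroacetophenone": {"C": 8, "H": 6, "F": 2, "O": 1},
    "3,4-dimethoxyacetophenone": {"C": 10, "H": 12, "O": 3},
}


def average_mass(formula):
    """Sum atomic masses weighted by the atom counts in the formula."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def min_spacing(masses):
    """Smallest gap between neighbouring masses after sorting."""
    ordered = sorted(masses)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    masses = {name: average_mass(f) for name, f in PRECURSORS.items()}
    for name, mass in masses.items():
        print(f"{name}: {mass:.2f} Da")
    print(f"minimum spacing: {min_spacing(masses.values()):.2f} Da")
```

Under these assumptions the closest pair of precursors differs by roughly 18 daltons, far more than a one-dalton mass resolution.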



Figure 18: Atmospheric Pressure Chemical Ionization
(APCI) mass spectrum of mass tag precursors shown in
Figure 17.



Figure 19: Examples of structures of four nucleotide
analogues for use in the sequencing by synthesis
approach. Each nucleotide analogue has a unique mass tag
attached to the base through a photocleavable linker,
and the 3'-OH is either exposed or capped with a MOM
group or an allyl group. The square brackets indicate
that the mass tag is cleavable. R = H, CH2OCH3 (MOM) or
CH2CH=CH2 (Allyl).

Figure 20: Example of synthesis of NHS ester of one
mass tag (Tag-3). A similar scheme is used to create
other mass tags.


Figure 21: A representative scheme for the synthesis of
the nucleotide analogue 3'-RO-G-Tag3. A similar scheme is
used to create the other three modified bases
3'-RO-A-Tag1, 3'-RO-C-Tag2, 3'-RO-T-Tag4. (i)
tetrakis(triphenylphosphine)palladium(0); (ii) POCl3,
Bn4N+ pyrophosphate; (iii) NH4OH; (iv) Na2CO3/NaHCO3
(pH = 9.0)/DMSO.



Figure 22: Examples of expected photocleavage products
of DNA containing a photocleavable mass tag.

Figure 23: System for DNA sequencing comprising
multiple channels in parallel and multiple mass
spectrometers in parallel. The example shows 96 channels
in a silica glass chip.



Figure 24: Parallel mass spectrometry system for DNA
sequencing. Example shows three mass spectrometers in
parallel. Samples are injected into the ion source
where they are mixed with a nebulizer gas and ionized. A
turbo pump is used to continuously sweep away free
radicals, neutral compounds and other undesirable
elements coming from the ion source. A second turbo
pump is used to generate a continuous vacuum in all
three analyzers and detectors simultaneously. The
acquired signal is then converted to a digital signal by
the A/D converter. All three signals are then sent to
the data acquisition processor, which converts the
signals to identify the mass tag in the injected sample
and thus identify the nucleotide sequence.



Detailed Description Of The Invention

The following definitions are presented as an aid in
understanding this invention.

As used herein, to cap an -OH group means to replace the
"H" in the -OH group with a chemical group. As
disclosed herein, the -OH group of the nucleotide
analogue is capped with a cleavable chemical group. To
uncap an -OH group means to cleave the chemical group
from a capped -OH group and to replace the chemical
group with "H", i.e., to replace the "R" in -OR with "H"
wherein "R" is the chemical group used to cap the -OH
group.


The nucleotide bases are abbreviated as follows: adenine
(A), cytosine (C), guanine (G), thymine (T), and uracil
(U).

An analogue of a nucleotide base refers to a structural
and functional derivative of the base of a nucleotide
which can be recognized by polymerase as a substrate.
That is, for example, an analogue of adenine (A) should
form hydrogen bonds with thymine (T), a C analogue
should form hydrogen bonds with G, a G analogue should
form hydrogen bonds with C, and a T analogue should
form hydrogen bonds with A, in a double helix format.
Examples of analogues of nucleotide bases include, but
are not limited to, 7-deaza-adenine and 7-deaza-guanine,
wherein the nitrogen atom at the 7-position of adenine
or guanine is substituted with a carbon atom.




WO 02/29003 PCT/US01/31243


A nucleotide analogue refers to a chemical compound that
is structurally and functionally similar to the
nucleotide, i.e. the nucleotide analogue can be
recognized by polymerase as a substrate. That is, for
example, a nucleotide analogue comprising adenine or an
analogue of adenine should form hydrogen bonds with
thymine, a nucleotide analogue comprising C or an
analogue of C should form hydrogen bonds with G, a
nucleotide analogue comprising G or an analogue of G
should form hydrogen bonds with C, and a nucleotide
analogue comprising T or an analogue of T should form
hydrogen bonds with A, in a double helix format.
Examples of nucleotide analogues disclosed herein
include analogues which comprise an analogue of the
nucleotide base such as 7-deaza-adenine or
7-deaza-guanine, wherein the nitrogen atom at the
7-position of adenine or guanine is substituted with a
carbon atom. Further examples include analogues in which
a label is attached through a cleavable linker to the
5-position of cytosine or thymine or to the 7-position
of deaza-adenine or deaza-guanine. Other examples
include analogues in which a small chemical moiety such
as -CH2OCH3 or -CH2CH=CH2 is used to cap the -OH group at
the 3'-position of deoxyribose. Analogues of
dideoxynucleotides can similarly be prepared.



As used herein, a porous surface is a surface which
contains pores or is otherwise uneven, such that the
surface area of the porous surface is increased relative
to the surface area when the surface is smooth.

The present invention is directed to a method for
sequencing a nucleic acid by detecting the identity of a
nucleotide analogue after the nucleotide analogue is
incorporated into a growing strand of DNA in a
polymerase reaction, which comprises the following
steps:

(i) attaching a 5' end of the nucleic acid to a
solid surface;

(ii) attaching a primer to the nucleic acid
attached to the solid surface;

(iii) adding a polymerase and one or more different
nucleotide analogues to the nucleic acid to
thereby incorporate a nucleotide analogue into
the growing strand of DNA, wherein the
incorporated nucleotide analogue terminates
the polymerase reaction and wherein each
different nucleotide analogue comprises (a) a
base selected from the group consisting of
adenine, guanine, cytosine, thymine, and
uracil, and their analogues; (b) a unique
label attached through a cleavable linker to
the base or to an analogue of the base; (c) a
deoxyribose; and (d) a cleavable chemical
group to cap an -OH group at a 3'-position of
the deoxyribose;

(iv) washing the solid surface to remove
unincorporated nucleotide analogues;

(v) detecting the unique label attached to the
nucleotide analogue that has been incorporated
into the growing strand of DNA, so as to
thereby identify the incorporated nucleotide
analogue;

(vi) adding one or more chemical compounds to
permanently cap any unreacted -OH group on the
primer attached to the nucleic acid or on a
primer extension strand formed by adding one
or more nucleotides or nucleotide analogues to
the primer;

(vii) cleaving the cleavable linker between the
nucleotide analogue that was incorporated into
the growing strand of DNA and the unique
label;

(viii) cleaving the cleavable chemical group
capping the -OH group at the 3'-position of
the deoxyribose to uncap the -OH group, and
washing the solid surface to remove cleaved
compounds; and

(ix) repeating steps (iii) through (viii) so as to
detect the identity of a newly incorporated
nucleotide analogue into the growing strand of
DNA;

wherein if the unique label is a dye, the order of
steps (v) through (vii) is: (v), (vi), and (vii);
and

wherein if the unique label is a mass tag, the
order of steps (v) through (vii) is: (vi), (vii),
and (v).


In one embodiment of any of the nucleotide analogues
described herein, the nucleotide base is adenine. In one
embodiment, the nucleotide base is guanine. In one
embodiment, the nucleotide base is cytosine. In one
embodiment, the nucleotide base is thymine. In one
embodiment, the nucleotide base is uracil. In one
embodiment, the nucleotide base is an analogue of
adenine. In one embodiment, the nucleotide base is an
analogue of guanine. In one embodiment, the nucleotide
base is an analogue of cytosine. In one embodiment, the
nucleotide base is an analogue of thymine. In one
embodiment, the nucleotide base is an analogue of
uracil.

In different embodiments of any of the inventions
described herein, the solid surface is glass, silicon,
or gold. In different embodiments, the solid surface is
a magnetic bead, a chip, a channel in a chip, or a
porous channel in a chip. In one embodiment, the solid
surface is glass. In one embodiment, the solid surface
is silicon. In one embodiment, the solid surface is
gold. In one embodiment, the solid surface is a
magnetic bead. In one embodiment, the solid surface is a
chip. In one embodiment, the solid surface is a channel
in a chip. In one embodiment, the solid surface is a
porous channel in a chip. Other materials can also be
used as long as the material does not interfere with the
steps of the method.

In one embodiment, the step of attaching the nucleic
acid to the solid surface comprises:

(i) coating the solid surface with a phosphine
moiety,

(ii) attaching an azido group to the 5' end of the
nucleic acid, and

(iii) immobilizing the 5' end of the nucleic acid to
the solid surface through interaction between
the phosphine moiety on the solid surface and
the azido group on the 5' end of the nucleic
acid.

In one embodiment, the step of coating the solid surface
with the phosphine moiety comprises:

(i) coating the surface with a primary amine, and

(ii) covalently coupling a N-hydroxysuccinimidyl
ester of triarylphosphine with the primary
amine.



In one embodiment, the nucleic acid that is attached to
the solid surface is a single-stranded deoxyribonucleic
acid (DNA). In another embodiment, the nucleic acid
that is attached to the solid surface in step (i) is a
double-stranded DNA, wherein only one strand is directly
attached to the solid surface, and wherein the strand
that is not directly attached to the solid surface is
removed by denaturing before proceeding to step (ii).
In one embodiment, the nucleic acid that is attached to
the solid surface is a ribonucleic acid (RNA), and the
polymerase in step (iii) is reverse transcriptase.



In one embodiment, the primer is attached to a 3' end of 
the nucleic acid in step (ii), and the attached primer
comprises a stable loop and an -OH group at a 3'-position
of a deoxyribose capable of self-priming in the
polymerase reaction. In one embodiment, the step of
attaching the primer to the nucleic acid comprises 
hybridizing the primer to the nucleic acid or ligating 
the primer to the nucleic acid. In one embodiment, the 
primer is attached to the nucleic acid through a 
ligation reaction which links the 3' end of the nucleic 
acid with the 5' end of the primer. 

In one embodiment, one or more of four different 
nucleotide analogs is added in step (iii), wherein each 
different nucleotide analogue comprises a different base 
selected from the group consisting of thymine or uracil 
or an analogue of thymine or uracil, adenine or an 
analogue of adenine, cytosine or an analogue of 
cytosine, and guanine or an analogue of guanine, and 
wherein each of the four different nucleotide analogues 
comprises a unique label. 

In one embodiment, the cleavable chemical group that
caps the -OH group at the 3'-position of the deoxyribose
in the nucleotide analogue is -CH2OCH3 or -CH2CH=CH2. Any
chemical group could be used as long as the group 1) is
stable during the polymerase reaction, 2) does not
interfere with the recognition of the nucleotide
analogue by polymerase as a substrate, and 3) is
cleavable.

In one embodiment, the unique label that is attached to
the nucleotide analogue is a fluorescent moiety or a
fluorescent semiconductor crystal. In further
embodiments, the fluorescent moiety is selected from the
group consisting of 5-carboxyfluorescein,
6-carboxyrhodamine-6G,
N,N,N',N'-tetramethyl-6-carboxyrhodamine, and
6-carboxy-X-rhodamine. In one embodiment, the
fluorescent moiety is 5-carboxyfluorescein. In one
embodiment, the fluorescent moiety is
6-carboxyrhodamine-6G or
N,N,N',N'-tetramethyl-6-carboxyrhodamine. In one
embodiment, the fluorescent moiety is
6-carboxy-X-rhodamine.

In one embodiment, the unique label that is attached to
the nucleotide analogue is a fluorescence energy
transfer tag which comprises an energy transfer donor
and an energy transfer acceptor. In further
embodiments, the energy transfer donor is
5-carboxyfluorescein or cyanine, and the energy
transfer acceptor is selected from the group consisting
of dichlorocarboxyfluorescein,
dichloro-6-carboxyrhodamine-6G,
dichloro-N,N,N',N'-tetramethyl-6-carboxyrhodamine, and
dichloro-6-carboxy-X-rhodamine. In one embodiment, the
energy transfer acceptor is dichlorocarboxyfluorescein.
In one embodiment, the energy transfer acceptor is
dichloro-6-carboxyrhodamine-6G. In one embodiment, the
energy transfer acceptor is
dichloro-N,N,N',N'-tetramethyl-6-carboxyrhodamine. In
one embodiment, the energy transfer acceptor is
dichloro-6-carboxy-X-rhodamine.

In one embodiment, the unique label that is attached to
the nucleotide analogue is a mass tag that can be
detected and differentiated by a mass spectrometer. In
further embodiments, the mass tag is selected from the
group consisting of a 2-nitro-α-methyl-benzyl group, a
2-nitro-α-methyl-3-fluorobenzyl group, a
2-nitro-α-methyl-3,4-difluorobenzyl group, and a
2-nitro-α-methyl-3,4-dimethoxybenzyl group. In one
embodiment, the mass tag is a 2-nitro-α-methyl-benzyl
group. In one embodiment, the mass tag is a
2-nitro-α-methyl-3-fluorobenzyl group. In one
embodiment, the mass tag is a
2-nitro-α-methyl-3,4-difluorobenzyl group. In one
embodiment, the mass tag is a
2-nitro-α-methyl-3,4-dimethoxybenzyl group. In one
embodiment, the mass tag is detected using a parallel
mass spectrometry system which comprises a plurality of
atmospheric pressure chemical ionization mass
spectrometers for parallel analysis of a plurality of
samples comprising mass tags.

In one embodiment, the unique label is attached through
a cleavable linker to a 5-position of cytosine or
thymine or to a 7-position of deaza-adenine or
deaza-guanine. The unique label could also be attached
through a cleavable linker to another position in the
nucleotide analogue as long as the attachment of the
label is stable during the polymerase reaction and the
nucleotide analogue can be recognized by polymerase as a
substrate. For example, the cleavable label could be
attached to the deoxyribose.

In one embodiment, the linker between the unique label
and the nucleotide analogue is cleaved by a means
selected from the group consisting of one or more of a
physical means, a chemical means, a physical chemical
means, heat, and light. In one embodiment, the linker is
cleaved by a physical means. In one embodiment, the
linker is cleaved by a chemical means. In one
embodiment, the linker is cleaved by a physical chemical
means. In one embodiment, the linker is cleaved by heat.
In one embodiment, the linker is cleaved by light. In
one embodiment, the linker is cleaved by ultraviolet
light. In a further embodiment, the cleavable linker is
a photocleavable linker which comprises a 2-nitrobenzyl
moiety.

In one embodiment, the cleavable chemical group used to
cap the -OH group at the 3'-position of the deoxyribose
is cleaved by a means selected from the group consisting
of one or more of a physical means, a chemical means, a
physical chemical means, heat, and light. In one
embodiment, the linker is cleaved by a physical chemical
means. In one embodiment, the linker is cleaved by heat.
In one embodiment, the linker is cleaved by light. In
one embodiment, the linker is cleaved by ultraviolet
light.



In one embodiment, the chemical compounds added in step
(vi) to permanently cap any unreacted -OH group on the
primer attached to the nucleic acid or on the primer
extension strand are a polymerase and one or more
different dideoxynucleotides or analogues of
dideoxynucleotides. In further embodiments, the
different dideoxynucleotides are selected from the group
consisting of 2',3'-dideoxyadenosine 5'-triphosphate,
2',3'-dideoxyguanosine 5'-triphosphate,
2',3'-dideoxycytidine 5'-triphosphate,
2',3'-dideoxythymidine 5'-triphosphate,
2',3'-dideoxyuridine 5'-triphosphate, and their
analogues. In one embodiment, the dideoxynucleotide is
2',3'-dideoxyadenosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is
2',3'-dideoxyguanosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is
2',3'-dideoxycytidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is
2',3'-dideoxythymidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is
2',3'-dideoxyuridine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxyadenosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxyguanosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxycytidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxythymidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxyuridine 5'-triphosphate.

25 

In one embodiment, a polymerase and one or more of four
different dideoxynucleotides are added in step (vi),
wherein each different dideoxynucleotide is selected
from the group consisting of 2',3'-dideoxyadenosine
5'-triphosphate or an analogue of 2',3'-dideoxyadenosine
5'-triphosphate; 2',3'-dideoxyguanosine 5'-triphosphate
or an analogue of 2',3'-dideoxyguanosine
5'-triphosphate; 2',3'-dideoxycytidine 5'-triphosphate or
an analogue of 2',3'-dideoxycytidine 5'-triphosphate;
and 2',3'-dideoxythymidine 5'-triphosphate or
2',3'-dideoxyuridine 5'-triphosphate or an analogue of
2',3'-dideoxythymidine 5'-triphosphate or an analogue of
2',3'-dideoxyuridine 5'-triphosphate. In one embodiment,
the dideoxynucleotide is 2',3'-dideoxyadenosine
5'-triphosphate. In one embodiment, the dideoxynucleotide
is an analogue of 2',3'-dideoxyadenosine
5'-triphosphate. In one embodiment, the dideoxynucleotide
is 2',3'-dideoxyguanosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxyguanosine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is
2',3'-dideoxycytidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxycytidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is
2',3'-dideoxythymidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is
2',3'-dideoxyuridine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxythymidine 5'-triphosphate. In one
embodiment, the dideoxynucleotide is an analogue of
2',3'-dideoxyuridine 5'-triphosphate.

Another type of chemical compound that reacts
specifically with the -OH group could also be used to
permanently cap any unreacted -OH group on the primer
attached to the nucleic acid or on an extension strand
formed by adding one or more nucleotides or nucleotide
analogues to the primer.

The invention provides a method for simultaneously
sequencing a plurality of different nucleic acids, which 
comprises simultaneously applying any of the methods 
disclosed herein for sequencing a nucleic acid to the 
plurality of different nucleic acids. In different 
embodiments, the method can be used to sequence from one 
to over 100,000 different nucleic acids simultaneously. 

The invention provides for the use of any of the methods 
disclosed herein for detection of single nucleotide 
polymorphisms, genetic mutation analysis, serial 
analysis of gene expression, gene expression analysis, 
identification in forensics, genetic disease association 
studies, DNA sequencing, genomic sequencing, 
translational analysis, or transcriptional analysis. 

The invention provides a method of attaching a nucleic 
acid to a solid surface which comprises: 

(i) coating the solid surface with a phosphine 
moiety, 

(ii) attaching an azido group to a 5' end of the 
nucleic acid, and 

(iii) immobilizing the 5' end of the nucleic acid 
to the solid surface through interaction 
between the phosphine moiety on the solid 
surface and the azido group on the 5' end of 
the nucleic acid. 

In one embodiment, the step of coating the solid surface
with the phosphine moiety comprises:

(i) coating the surface with a primary amine, and

(ii) covalently coupling a N-hydroxysuccinimidyl
ester of triarylphosphine with the primary
amine.

In different embodiments, the solid surface is glass,
silicon, or gold. In different embodiments, the solid
surface is a magnetic bead, a chip, a channel in a
chip, or a porous channel in a chip.

In different embodiments, the nucleic acid that is
attached to the solid surface is a single-stranded or
double-stranded DNA or an RNA. In one embodiment, the
nucleic acid is a double-stranded DNA and only one
strand is attached to the solid surface. In a further
embodiment, the strand of the double-stranded DNA that
is not attached to the solid surface is removed by
denaturing.

The invention provides for the use of any of the methods 
disclosed herein for attaching a nucleic acid to a 
surface for gene expression analysis, microarray based 
gene expression analysis, or mutation detection, 
translational analysis, transcriptional analysis, or for 
other genetic applications. 



The invention provides a nucleotide analogue which
comprises:

(a) a base selected from the group consisting of 
adenine or an analogue of adenine, cytosine or 
an analogue of cytosine, guanine or an 
analogue of guanine, thymine or an analogue of 
thymine, and uracil or an analogue of uracil; 

(b) a unique label attached through a cleavable 
linker to the base or to an analogue of the 
base; 

(c) a deoxyribose; and 

(d) a cleavable chemical group to cap an -OH group
at a 3'-position of the deoxyribose.

In one embodiment of the nucleotide analogue, the
cleavable chemical group that caps the -OH group at the
3'-position of the deoxyribose is -CH2OCH3 or
-CH2CH=CH2.

In one embodiment, the unique label is a fluorescent
moiety or a fluorescent semiconductor crystal. In
further embodiments, the fluorescent moiety is selected
from the group consisting of 5-carboxyfluorescein,
6-carboxyrhodamine-6G,
N,N,N',N'-tetramethyl-6-carboxyrhodamine, and
6-carboxy-X-rhodamine.

In one embodiment, the unique label is a fluorescence
energy transfer tag which comprises an energy transfer
donor and an energy transfer acceptor. In further
embodiments, the energy transfer donor is
5-carboxyfluorescein or cyanine, and the energy
transfer acceptor is selected from the group consisting
of dichlorocarboxyfluorescein,
dichloro-6-carboxyrhodamine-6G,
dichloro-N,N,N',N'-tetramethyl-6-carboxyrhodamine, and
dichloro-6-carboxy-X-rhodamine.

In one embodiment, the unique label is a mass tag that
can be detected and differentiated by a mass
spectrometer. In further embodiments, the mass tag is
selected from the group consisting of a
2-nitro-α-methyl-benzyl group, a
2-nitro-α-methyl-3-fluorobenzyl group, a
2-nitro-α-methyl-3,4-difluorobenzyl group, and a
2-nitro-α-methyl-3,4-dimethoxybenzyl group.



In one embodiment, the unique label is attached through
a cleavable linker to a 5-position of cytosine or
thymine or to a 7-position of deaza-adenine or
deaza-guanine. The unique label could also be attached
through a cleavable linker to another position in the
nucleotide analogue as long as the attachment of the
label is stable during the polymerase reaction and the
nucleotide analogue can be recognized by polymerase as a
substrate. For example, the cleavable label could be
attached to the deoxyribose.

In one embodiment, the linker between the unique label
and the nucleotide analogue is cleavable by a means
selected from the group consisting of one or more of a
physical means, a chemical means, a physical chemical
means, heat, and light. In a further embodiment, the
cleavable linker is a photocleavable linker which
comprises a 2-nitrobenzyl moiety.

In one embodiment, the cleavable chemical group used to
cap the -OH group at the 3'-position of the deoxyribose
is cleavable by a means selected from the group
consisting of one or more of a physical means, a
chemical means, a physical chemical means, heat, and
light.

In different embodiments, the nucleotide analogue is
selected from the group consisting of:

[Chemical structures not present in the source text]

wherein R is a cleavable chemical group used to cap
the -OH group at the 3'-position of the
deoxyribose.

In different embodiments, the nucleotide analogue is
selected from the group consisting of:

[Chemical structures garbled in the source text; not reproduced]

wherein R is -CH2OCH3 or -CH2CH=CH2.

In different embodiments, the nucleotide analogue is
selected from the group consisting of:

[Chemical structures garbled in the source text; not reproduced]

wherein Tag1, Tag2, Tag3, and Tag4 are four different
mass tag labels; and

wherein R is a cleavable chemical group used to cap
the -OH group at the 3'-position of the
deoxyribose.

In different embodiments, the nucleotide analogue is
selected from the group consisting of:

[Chemical structures garbled in the source text; not reproduced]

wherein R is -CH2OCH3 or -CH2CH=CH2.

The invention provides for the use of any of the nucleotide
analogues disclosed herein for detection of single
nucleotide polymorphisms, genetic mutation analysis,
serial analysis of gene expression, gene expression
analysis, identification in forensics, genetic disease
association studies, DNA sequencing, genomic sequencing,
translational analysis, or transcriptional analysis.



The invention provides a parallel mass spectrometry
system, which comprises a plurality of atmospheric
pressure chemical ionization mass spectrometers for
parallel analysis of a plurality of samples comprising
mass tags. In one embodiment, the mass spectrometers
are quadrupole mass spectrometers. In one embodiment,
the mass spectrometers are time-of-flight mass
spectrometers. In one embodiment, the mass
spectrometers are contained in one device. In one
embodiment, the system further comprises two
turbo-pumps, wherein one pump is used to generate a
vacuum and a second pump is used to remove undesired
elements. In one embodiment, the system comprises at
least three mass spectrometers. In one embodiment, the
mass tags have molecular weights between 150 daltons
and 250 daltons. The invention provides for the use of
the system for DNA sequencing analysis, detection of
single nucleotide polymorphisms, genetic mutation
analysis, serial analysis of gene expression, gene
expression analysis, identification in forensics,
genetic disease association studies, DNA sequencing,
genomic sequencing, translational analysis, or
transcriptional analysis.

This invention will be better understood from the
Experimental Details which follow. However, one skilled
in the art will readily appreciate that the specific
methods and results discussed are merely illustrative of
the invention as described more fully in the claims
which follow thereafter.

Experimental Details

1. The Sequencing by Synthesis Approach

Sequencing DNA by synthesis involves the detection of
the identity of each nucleotide as it is incorporated
into the growing strand of DNA in the polymerase
reaction. The fundamental requirements for such a
system to work are: (1) the availability of 4 nucleotide
analogues (aA, aC, aG, aT) each labeled with a unique
label and containing a chemical moiety capping the 3'-OH
group; (2) the 4 nucleotide analogues (aA, aC, aG, aT)
need to be efficiently and faithfully incorporated by
DNA polymerase as terminators in the polymerase
reaction; (3) the tag and the group capping the 3'-OH
need to be removed with high yield to allow the
incorporation and detection of the next nucleotide; and
(4) the growing strand of DNA should survive the
washing, detection and cleavage processes to remain
annealed to the DNA template.

The sequencing by synthesis approach disclosed herein is
illustrated in Figures 2A-2B. In Figure 2A, an example
is shown where the unique labels are fluorescent dyes
and the surface is a chip; in Figure 2B, the unique
labels are mass tags and the surface is channels etched
into a chip. The synthesis approach uses a solid
surface such as a glass chip with an immobilized DNA
template that is able to self prime for initiating the
polymerase reaction, and four nucleotide analogues
(3'-RO-A-LABEL1, 3'-RO-C-LABEL2, 3'-RO-G-LABEL3,
3'-RO-T-LABEL4) each labeled with a unique label, e.g.
a fluorescent dye or a mass tag, at a specific location
on the purine or pyrimidine base, and a small cleavable
chemical group (R) to cap the 3'-OH group. Upon adding
the four nucleotide analogues and DNA polymerase, only
one nucleotide analogue that is complementary to the
next nucleotide on the template is incorporated by the
polymerase on each spot of the surface (step 1 in Fig.
2A and 2B).



As shown in Figure 2A, where the unique labels are dyes,
after removing the excess reagents and washing away any
unincorporated nucleotide analogues on the chip, a
detector is used to detect the unique label. For
example, a four color fluorescence imager is used to
image the surface of the chip, and the unique
fluorescence emission from a specific dye on the
nucleotide analogues on each spot of the chip will
reveal the identity of the incorporated nucleotide (step
2 in Fig. 2A). After imaging, the small amount of
unreacted 3'-OH group on the self-primed template moiety
is capped by excess dideoxynucleoside triphosphates
(ddNTPs) (ddATP, ddGTP, ddTTP, and ddCTP) and DNA
polymerase to avoid interference with the next round of
synthesis (step 3 in Fig. 2A), a concept similar to the
capping step in automated solid phase DNA synthesis
(Caruthers, 1985). The ddNTPs, which lack a 3'-hydroxyl
group, are chosen to cap the unreacted 3'-OH of the
nucleotide due to their small size compared with the
dye-labeled nucleotides, and the excellent efficiency
with which they are incorporated by DNA polymerase. The
dye moiety is then cleaved by light (~350 nm), and the R
group protecting the 3'-OH is removed chemically to
generate free 3'-OH group with high yield (step 4 in
Fig. 2A). A washing step is applied to wash away the
cleaved dyes and the R group. The self-primed DNA
moiety on the chip at this stage is ready for the next
cycle of the reaction to identify the next nucleotide
in the sequence of the template DNA (step 5 in Fig. 2A).
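
The dye-based cycle described above can be sketched as a short idealized simulation. This is only an illustration under stated assumptions: every incorporation is taken to be error-free, the capping, cleavage, and washing steps are modeled as having no effect on the read, and the names `COMPLEMENT`, `DYE_FOR_BASE`, and `read_spot` are hypothetical, as is the assignment of Dye1 through Dye4 to particular bases.

```python
# Watson-Crick pairing: the analogue incorporated is the complement
# of the next template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical dye assignment (placeholder names, not the source's dyes).
DYE_FOR_BASE = {"A": "Dye1", "C": "Dye2", "G": "Dye3", "T": "Dye4"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def read_spot(template, cycles):
    """Return the bases read from one spot after `cycles` cycles.

    `template` lists the template bases in the order the polymerase
    meets them. The 3' cap limits extension to one base per cycle.
    """
    read = []
    for next_base in template[:cycles]:
        # Step 1: only the complementary capped analogue is incorporated.
        incorporated = COMPLEMENT[next_base]
        # Step 2: imaging reports the dye carried by that analogue.
        observed_dye = DYE_FOR_BASE[incorporated]
        read.append(BASE_FOR_DYE[observed_dye])
        # Steps 3-5 (ddNTP capping, dye cleavage, removal of R, wash)
        # are treated as ideal and do not change the read in this sketch.
    return "".join(read)
```

For example, under these assumptions `read_spot("ATGC", 4)` yields the complementary read `"TACG"`, one base per cycle.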

It is now a routine procedure to immobilize high density
(>10,000 spots per chip) single stranded DNA on a 4 cm x
1 cm glass chip (Schena et al. 1995). Thus, in the DNA
sequencing system disclosed herein, more than 10,000
bases can be identified after each cycle and, after 100
cycles, a million base pairs will be generated from one
sequencing chip.
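
The throughput figure above is a simple product of spots and cycles, which can be checked with a one-line calculation. The function name `bases_read` is hypothetical, and the sketch assumes every spot yields exactly one base in every cycle.

```python
def bases_read(spots_per_chip, cycles):
    """Total bases identified, assuming one base per spot per cycle."""
    return spots_per_chip * cycles


# Figures quoted in the text: 10,000 spots and 100 cycles.
total = bases_read(10_000, 100)
```

With the quoted figures, `total` is 1,000,000, matching the million bases stated in the text.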

Possible DNA polymerases include Thermo Sequenase, Taq
FS DNA polymerase, T7 DNA polymerase, and Vent (exo-)
DNA polymerase. The fluorescence emission from each
specific dye can be detected using a fluorimeter that is
equipped with an accessory to detect fluorescence from a
glass slide. For large scale evaluation, a multi-color
scanning system capable of detecting multiple different
fluorescent dyes (500 nm - 700 nm) (GSI Lumonics
ScanArray 5000 Standard Biochip Scanning System) on a
glass slide can be used.

An example of the sequencing by synthesis approach using
mass tags is shown in Figure 2B. The approach uses a
solid surface, such as porous silica glass channels in
a chip, with an immobilized DNA template that is able to
self prime for initiating the polymerase reaction, and
four nucleotide analogues (3'-RO-A-Tag1, 3'-RO-C-Tag2,
3'-RO-G-Tag3, 3'-RO-T-Tag4) each labeled with a unique
photocleavable mass tag on the specific location of the
base, and a small cleavable chemical group (R) to cap
the 3'-OH group. Upon adding the four nucleotide
analogues and DNA polymerase, only one nucleotide
analogue that is complementary to the next nucleotide on
the template is incorporated by polymerase in each
channel of the glass chip (step 1 in Fig. 2B). After
removing the excess reagents and washing away any
unincorporated nucleotide analogues on the chip, the
small amount of unreacted 3'-OH group on the self-primed
template moiety is capped by excess ddNTPs (ddATP,
ddGTP, ddTTP and ddCTP) and DNA polymerase to avoid
interference with the next round of synthesis (step 2 in
Fig. 2B). The ddNTPs are chosen to cap the unreacted
3'-OH of the nucleotide due to their small size compared
with the labeled nucleotides, and the excellent
efficiency with which they are incorporated by DNA
polymerase. The mass tags are cleaved by irradiation
with light (~350 nm) (step 3 in Fig. 2B) and then
detected with a mass spectrometer. The unique mass of
each tag yields the identity of the nucleotide in each
channel (step 4 in Fig. 2B). The R protecting group is
then removed chemically and washed away to generate free
3'-OH group with high yield (step 5 in Fig. 2B). The
self-primed DNA moiety on the chip at this stage is
ready for the next cycle of the reaction to identify the
next nucleotide in the sequence of the template DNA
(step 6 in Fig. 2B).
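
The difference between the two cycles is the point at which the label is read: dyes are imaged while still attached, whereas mass tags must be cleaved before they can be detected. The sketch below records the two step orders as described for Figures 2A and 2B; the step names and the `cycle_steps` helper are hypothetical labels chosen for illustration.

```python
# Step order within one cycle, paraphrased from the description of
# Figures 2A (dyes) and 2B (mass tags). Step names are illustrative.
CYCLE_STEPS = {
    "dye": [
        "incorporate analogue",
        "wash",
        "detect label",        # imaged while still attached
        "cap with ddNTPs",
        "cleave label",
        "remove 3' cap",
        "wash",
    ],
    "mass_tag": [
        "incorporate analogue",
        "wash",
        "cap with ddNTPs",
        "cleave label",        # tag must be released first
        "detect label",        # then read by the mass spectrometer
        "remove 3' cap",
        "wash",
    ],
}


def cycle_steps(label_type):
    """Return a copy of the ordered steps for 'dye' or 'mass_tag'."""
    if label_type not in CYCLE_STEPS:
        raise ValueError(f"unknown label type: {label_type!r}")
    return list(CYCLE_STEPS[label_type])
```

In both orders the incorporation comes first and removal of the 3' cap comes after the label has been read, so the next cycle always starts from a free 3'-OH group.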

Since the development of new ionization techniques such
as matrix assisted laser desorption ionization (MALDI)
and electrospray ionization (ESI), mass spectrometry has
become an indispensable tool in many areas of biomedical
research. Though these ionization methods are suitable
for the analysis of bioorganic molecules, such as
peptides and proteins, improvements in both detection
and sample preparation are required for implementation
of mass spectrometry for DNA sequencing applications.
Since the approach disclosed herein uses small and
stable mass tags, there is no need to detect large DNA
sequencing fragments directly and it is not necessary to
use MALDI or ESI methods for detection. Atmospheric
pressure chemical ionization (APCI) is an ionization
method that uses a gas-phase ion-molecule reaction at
atmospheric pressure (Dizidic et al. 1975). In this
method, samples are introduced by either chromatography
or flow injection into a pneumatic nebulizer where they
are converted into small droplets by a high-speed beam
of nitrogen gas. When the heated gas and solution
arrive at the reaction area, the excess amount of
solvent is ionized by corona discharge. This ionized
mobile phase acts as the ionizing agent toward the
samples and yields pseudo molecular (M+H)+ and (M-H)-
ions. Due to the corona discharge ionization method,
high ionization efficiency is attainable, maintaining
stable ionization conditions with detection sensitivity
lower than the femtomole region for small and stable
organic compounds. However, due to the limited detection
of large molecules, ESI and MALDI have replaced APCI for
analysis of peptides and nucleic acids. Since in the
approach disclosed the mass tags to be detected are
relatively small and very stable organic molecules, the
ability to detect large biological molecules gained by
using ESI and MALDI is not necessary. APCI has several
advantages over ESI and MALDI because it does not


WO 02/29003 PCT/US01/31243

require any tedious sample preparation such as desalting
or mixing with matrix to prepare crystals on a target
plate. In ESI, the sample nature and sample preparation
conditions (i.e. the existence of buffer or inorganic
salts) suppress the ionization efficiency. MALDI
requires the addition of matrix prior to sample
introduction into the mass spectrometer and its speed is
often limited by the need to search for an ideal
irradiation spot to obtain interpretable mass spectra.
These limitations are overcome by APCI because the mass
tag solution can be injected directly with no additional
sample purification or preparation into the mass
spectrometer. Since the mass tagged samples are
volatile and have small mass numbers, these compounds
are easily detectable by APCI ionization with high
sensitivity. This system can be scaled up into a high
throughput operation.
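
As a small worked illustration of the pseudo molecular ions mentioned above, the expected m/z of the singly charged (M+H)+ and (M-H)- ions follows from adding or subtracting one proton mass from the neutral mass M. The sketch assumes singly charged ions and a monoisotopic neutral mass in daltons; the function name `pseudo_molecular_ions` and the example mass of 200.0 Da are hypothetical and are not taken from the source.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def pseudo_molecular_ions(neutral_mass):
    """Return (m/z of (M+H)+, m/z of (M-H)-) for a singly charged ion."""
    protonated = neutral_mass + PROTON_MASS      # (M+H)+
    deprotonated = neutral_mass - PROTON_MASS    # (M-H)-
    return protonated, deprotonated


# Hypothetical tag fragment with a neutral mass of 200.0 Da.
positive_mz, negative_mz = pseudo_molecular_ions(200.0)
```

Four tags whose neutral masses differ by more than the instrument's resolution would therefore give four distinct peaks in either ion mode.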



Each component of the sequencing by synthesis system is
described in more detail below.

2. Construction of a Surface Containing Immobilized
Self-primed DNA Moiety

The single stranded DNA template immobilized on a
surface is prepared according to the scheme shown in
Figure 3. The surface can be, for example, a glass
chip, such as a 4 cm x 1 cm glass chip, or channels in a
glass chip. The surface is first treated with 0.5 M
NaOH, washed with water, and then coated with high
density 3-aminopropyltrimethoxysilane in aqueous ethanol
(Woolley et al. 1994) forming a primary amine surface.
The N-hydroxysuccinimidyl (NHS) ester of triarylphosphine
(1) is covalently coupled with the primary amine group,
converting the amine surface to a novel triarylphosphine
surface, which specifically reacts with DNA containing
an azido group (2) forming a chip with immobilized DNA.
Since the azido group is only located at the 5' end of
the DNA and the coupling reaction is through the unique
reaction of the triarylphosphine moiety with the azido
group in aqueous solution (Saxon and Bertozzi 2000),
such a DNA surface will provide an optimal condition for
hybridization.

The NHS ester of triarylphosphine (1) is prepared
according to the scheme shown in Figure 4.
3-Diphenylphosphino-4-methoxycarbonyl-benzoic acid (3) is
prepared according to the procedure described by
Bertozzi et al. (Saxon and Bertozzi 2000). Treatment of
(3) with N-hydroxysuccinimide forms the corresponding
NHS ester (4). Coupling of (4) with an amino carboxylic
acid moiety produces compound (5) that has a long linker
(n = 1 to 10) for optimized coupling with DNA on the
surface. Treatment of (5) with N-hydroxysuccinimide
generates the NHS ester (1) which is ready for coupling
with the primary amine coated surface (Figure 3).

The azido labeled DNA (2) is synthesized according to
the scheme shown in Figure 5. Treatment of the ethyl ester
of 5-bromovaleric acid with sodium azide followed by
hydrolysis produces 5-azidovaleric acid (Khoukhi et al.
1987), which is subsequently converted to an NHS ester
for coupling with an amino linker modified
oligonucleotide primer. Using the azido-labeled primer
to perform a polymerase chain reaction (PCR)
generates the azido-labeled DNA template (2) for coupling
with the triarylphosphine-modified surface (Figure 3).



The self-primed DNA template moiety on the sequencing
chip is constructed as shown in Figure 6 (A & B) using
enzymatic ligation. A 5'-phosphorylated, 3'-OH capped
loop oligonucleotide primer (B) is synthesized by a
solid phase DNA synthesizer. Primer (B) is synthesized
using a modified C phosphoramidite whose 3'-OH is capped
with either a MOM (-CH2OCH3) group or an allyl
(-CH2CH=CH2) group (designated by "R" in Figure 6) at the
3'-end of the oligonucleotide to prevent the self
ligation of the primer in the ligation reaction. Thus,
the looped primer can only ligate to the 3'-end of the
DNA templates that are immobilized on the sequencing
chip using T4 RNA ligase (Zhang et al. 1996) to form the
self-primed DNA template moiety (A). The looped primer
(B) is designed to contain a very stable loop (Antao et
al. 1991) and a stem containing the sequence of the M13
reverse DNA sequencing primer for efficient priming in
the polymerase reaction once the primer is ligated to
the immobilized DNA on the sequencing chip and the 3'-OH
cap group is chemically cleaved off (Ireland et al.
1986; Kamal et al. 1999).


3. Sequencing by Synthesis Evaluation Using Nucleotide
Analogues 3'-HO-A-Dye1, 3'-HO-C-Dye2, 3'-HO-G-Dye3, 3'-HO-T-Dye4



A scheme has been developed for evaluating the
photocleavage efficiency using different dyes and
testing the sequencing by synthesis approach. Four
nucleotide analogues 3'-HO-A-Dye1, 3'-HO-C-Dye2,
3'-HO-G-Dye3, and 3'-HO-T-Dye4, each labeled with a unique
fluorescent dye through a photocleavable linker, are
synthesized and used in the sequencing by synthesis
approach. Examples of dyes include, but are not limited
to: Dye1 = FAM, 5-carboxyfluorescein; Dye2 = R6G,
6-carboxyrhodamine-6G; Dye3 = TAM,
N,N,N',N'-tetramethyl-6-carboxyrhodamine; and Dye4 = ROX,
6-carboxy-X-rhodamine. The structures of the 4
nucleotide analogues are shown in Figure 7 (R = H).

The photocleavable 2-nitrobenzyl moiety has been used to
link biotin to DNA and protein for efficient removal by
UV light (~350 nm) (Olejnik et al. 1995, 1999). In the
approach disclosed herein the 2-nitrobenzyl group is
used to bridge the fluorescent dye and nucleotide
together to form the dye labeled nucleotides as shown in
Figure 7.

As a representative example, the synthesis of 3'-HO-G-Dye3
(Dye3 = TAM) is shown in Figure 8.
7-deaza-alkynylamino-dGTP is prepared using well-established
procedures (Prober et al. 1987; Lee et al. 1992 and
Hobbs et al. 1991). Linker-TAM is synthesized by
coupling the Photocleavable Linker (Rollaf 1982) with
NHS-TAM. 7-deaza-alkynylamino-dGTP is then coupled with
the Linker-TAM to produce 3'-HO-G-TAM. The nucleotide
analogues with a free 3'-OH (i.e., R = H) are good
substrates for the polymerase. An immobilized DNA
template is synthesized (Figure 9) that contains a
portion of nucleotide sequence ACGTACGACGT (SEQ ID NO:
1) that has no repeated sequences after the priming
site. 3'-HO-A-Dye1 and DNA polymerase are added to the
self-primed DNA moiety and it is incorporated to the 3'
site of the DNA. Then the steps in Figure 2A are
followed (the chemical cleavage step is not required
here because the 3'-OH is free) to detect the
fluorescent signal from Dye-1 at 520 nm. Next, 3'-HO-C-Dye2
is added to image the fluorescent signal from Dye-2
at 550 nm. Next, 3'-HO-G-Dye3 is added to image the
fluorescent signal from Dye-3 at 580 nm, and finally
3'-HO-T-Dye4 is added to image the fluorescent signal from
Dye-4 at 610 nm.
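
The one-nucleotide-at-a-time readout described above can be summarized as a
small lookup. The following Python sketch is illustrative only: the emission
wavelengths and dye-to-base assignments are those given in the text, while the
function names, the tolerance value, and the reading of "no repeated sequences"
as "no two adjacent identical bases" are assumptions made for this example.

```python
# Emission maxima (nm) quoted in the text for Dye-1 to Dye-4,
# paired with the base that each dye labels.
EMISSION_NM_TO_BASE = {520: "A", 550: "C", 580: "G", 610: "T"}


def call_base(peak_nm, tolerance_nm=10):
    """Return the base whose dye emission is closest to peak_nm.

    tolerance_nm is a hypothetical acceptance window, not a value
    taken from the specification. Returns None if no dye is close enough.
    """
    nearest = min(EMISSION_NM_TO_BASE, key=lambda nm: abs(nm - peak_nm))
    if abs(nearest - peak_nm) > tolerance_nm:
        return None
    return EMISSION_NM_TO_BASE[nearest]


def has_adjacent_repeat(seq):
    """True if any base is immediately followed by the same base."""
    return any(a == b for a, b in zip(seq, seq[1:]))


if __name__ == "__main__":
    template_portion = "ACGTACGACGT"  # SEQ ID NO: 1
    print(has_adjacent_repeat(template_portion))
    print([call_base(nm) for nm in (520, 550, 580, 610)])
```

A template without adjacent identical bases matters in this test because the
3'-OH is free, so a run of identical bases would allow more than one labeled
nucleotide to be incorporated in a single addition.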

Results on photochemical cleavage efficiency 

The expected photolysis products of DNA containing a
photocleavable fluorescent dye at the 3' end of the DNA
are shown in Figure 10. The 2-nitrobenzyl moiety has
been successfully employed in a wide range of studies as
a photocleavable protecting group (Pillai 1980). The
efficiency of the photocleavage step depends on several
factors including the efficiency of light absorption by
the 2-nitrobenzyl moiety, the efficiency of the primary
photochemical step, and the efficiency of the secondary
thermal processes which lead to the final cleavage
process (Turro 1991). Burgess et al. (1997) have
reported the successful photocleavage of a fluorescent
dye attached through a 2-nitrobenzyl linker on a
nucleotide moiety, which shows that the fluorescent dye
does not quench the photocleavage process. A
photolabile protecting group based on the 2-nitrobenzyl
chromophore has also been developed for biological
labeling applications that involve photocleavage
(Olejnik et al. 1999). The protocol disclosed herein is
used to optimize the photocleavage process shown in
Figure 10. The absorption spectra of 2-nitrobenzyl
compounds are examined and compared quantitatively to
the absorption spectra of the fluorescent dyes. Since
there will be a one-to-one relationship between the
number of 2-nitrobenzyl moieties and the dye molecules,
the ratio of extinction coefficients of these two
species will reflect the competition for light
absorption at specific wavelengths. From this
information, the wavelengths at which the 2-nitrobenzyl
moieties absorb most competitively can be determined,
similar to the approach reported by Olejnik et
al. (1995).
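
The extinction coefficient comparison described above can be sketched in a few
lines of Python. This is a minimal illustration under stated assumptions: the
function name is hypothetical, and the numbers in the usage section are
placeholders chosen for the example, not measured absorption data.

```python
def most_competitive_wavelength(eps_linker, eps_dye):
    """Pick the wavelength where the linker/dye extinction ratio is largest.

    eps_linker and eps_dye map wavelength (nm) to molar extinction
    coefficient (M^-1 cm^-1). Only wavelengths present in both are used.
    """
    shared = sorted(set(eps_linker) & set(eps_dye))
    ratios = {
        nm: eps_linker[nm] / eps_dye[nm]
        for nm in shared
        if eps_dye[nm] > 0
    }
    best = max(ratios, key=ratios.get)
    return best, ratios


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT measured data.
    linker = {300: 4000.0, 340: 5000.0, 380: 1000.0}
    dye = {300: 8000.0, 340: 2500.0, 380: 4000.0}
    best_nm, ratios = most_competitive_wavelength(linker, dye)
    print(best_nm, ratios[best_nm])
```

Because the linker and the dye are present in a one-to-one ratio, the ratio of
their extinction coefficients at a given wavelength is also the ratio of light
absorbed by each, which is why a simple per-wavelength quotient suffices here.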



A photolysis setup can be used which allows a high
throughput of monochromatic light from a 1000 watt high
pressure xenon lamp (LX1000UV, ILC) in conjunction with
a monochromator (Kratos, Schoeffel Instruments). This
instrument allows the evaluation of the photocleavage of
model systems as a function of the intensity and
excitation wavelength of the absorbed light. Standard
analytical methods are used to determine the extent of
photocleavage. From this information, the efficiency of
the photocleavage as a function of wavelength can be
determined. The wavelength at which photocleavage
occurs most efficiently can be selected for use in
the sequencing system.



Photocleavage results have been obtained using a model
system as shown in Figure 11. Coupling of
PC-LC-Biotin-NHS ester (Pierce, Rockford IL) with
5-(aminoacetamido)-fluorescein (5-aminoFAM) (Molecular
Probes, Eugene OR) in dimethyl sulfoxide
(DMSO)/NaHCO3 (pH = 8.2) overnight at room temperature
produces PC-LC-Biotin-FAM, which is composed of a biotin
at one end, a photocleavable 2-nitrobenzyl group in the
middle, and a dye tag (FAM) at the other end. This
photocleavable moiety closely mimics the designed
photocleavable nucleotide analogues shown in Figure 10.
Thus the successful photolysis of the PC-LC-Biotin-FAM
moiety provides proof of the principle of high
efficiency photolysis as used in the DNA sequencing
system. For the photolysis study, PC-LC-Biotin-FAM is first
immobilized on a microscope glass slide coated with
streptavidin (XENOPORE, Hawthorne NJ). After washing
off the non-immobilized PC-LC-Biotin-FAM, the
fluorescence emission spectrum of the immobilized
PC-LC-Biotin-FAM was taken as shown in Figure 12 (Spectrum a).
The strong fluorescence emission indicates that
PC-LC-Biotin-FAM is successfully immobilized to the
streptavidin coated slide surface. The
photocleavability of the 2-nitrobenzyl linker by
irradiation at 350 nm was then tested. After 10 minutes
of photolysis (λirr = 350 nm; ~0.5 mW/cm2) and before any
washing, the fluorescence emission spectrum of the same
spot on the slide was taken and showed no decrease in
intensity (Figure 12, Spectrum b), indicating that the
dye (FAM) was not bleached during the photolysis process
at 350 nm. After washing the glass slide with HPLC
water following photolysis, the fluorescence emission
spectrum of the same spot on the slide showed a
significant intensity decrease (Figure 12, Spectrum c),
which indicates that most of the fluorescent dye (FAM)
was cleaved from the immobilized biotin moiety and was
removed by the washing procedure. This experiment shows
that high efficiency cleavage of the fluorescent dye can
be obtained using the 2-nitrobenzyl photocleavable
linker.
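
One way to put a number on the comparison of Spectrum a and Spectrum c is to
treat the emission intensity at the same spot as proportional to the amount of
dye still attached. The Python sketch below does this; the function name and
the proportionality assumption are introduced for illustration only, and the
intensities in the usage section are placeholders, not values from Figure 12.

```python
def fraction_removed(intensity_before, intensity_after):
    """Fraction of fluorescence signal lost between two readings.

    Assumes intensity is proportional to the amount of dye on the
    surface, which is an assumption of this sketch.
    """
    if intensity_before <= 0:
        raise ValueError("intensity_before must be positive")
    return 1.0 - intensity_after / intensity_before


if __name__ == "__main__":
    # Placeholder peak intensities (arbitrary units), not measured data.
    spectrum_a_peak = 100.0  # before photolysis
    spectrum_c_peak = 20.0   # after photolysis and washing
    print(fraction_removed(spectrum_a_peak, spectrum_c_peak))
```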

4. Sequencing by Synthesis Evaluation Using Nucleotide
Analogues 3'-RO-A-Dye1, 3'-RO-C-Dye2, 3'-RO-G-Dye3, 3'-RO-T-Dye4

Once the steps and conditions in Section 3 are
optimized, the synthesis of nucleotide analogues
3'-RO-A-Dye1, 3'-RO-C-Dye2, 3'-RO-G-Dye3, 3'-RO-T-Dye4 can be
pursued for further study of the system. Here the 3'-OH
is capped in all four nucleotide analogues, which then can be
mixed together with DNA polymerase and used to evaluate
the sequencing system using the scheme in Figure 9. The
MOM (-CH2OCH3) or allyl (-CH2CH=CH2) group is used to cap
the 3'-OH group using well-established synthetic
procedures (Figure 13) (Fuji et al. 1975, Metzker et al.
1994). These groups can be removed chemically with high
yield as shown in Figure 14 (Ireland et al. 1986; Kamal
et al. 1999). The chemical cleavage of the MOM and
allyl groups is fairly mild and specific, so as not to
degrade the DNA template moiety. For example, the
cleavage of the allyl group takes 3 minutes with more
than 93% yield (Kamal et al. 1999), while the MOM group
is reported to be cleaved with close to 100% yield
(Ireland et al. 1986).

5. Using Energy Transfer Coupled Dyes To Optimize The 
Sequencing By Synthesis System 


The spectral property of the fluorescent tags can be
optimized by using energy transfer (ET) coupled dyes.
The ET primer and ET dideoxynucleotides have been shown
to be a superior set of reagents for 4-color DNA
sequencing that allows the use of one laser to excite
multiple sets of fluorescent tags (Ju et al. 1995). It
has been shown that DNA polymerase (Thermo Sequenase and
Taq FS) can efficiently incorporate the ET dye labeled
dideoxynucleotides (Rosenblum et al. 1997). These ET
dye-labeled sequencing reagents are now widely used in
large scale DNA sequencing projects, such as the human
genome project. A library of ET dye labeled nucleotide
analogues can be synthesized as shown in Figure 15 for
optimization of the DNA sequencing system. The ET dye
set (FAM-Cl2FAM, FAM-Cl2R6G, FAM-Cl2TAM, FAM-Cl2ROX) using
FAM as a donor and dichloro(FAM, R6G, TAM, ROX) as
acceptors has been reported in the literature (Lee et
al. 1997) and constitutes a set of commercially
available DNA sequencing reagents. These ET dye sets
have been proven to produce enhanced fluorescence
intensity, and the nucleotides labeled with these ET
dyes at the 5-position of T and C and the 7-position of
G and A are excellent substrates of DNA polymerase.
Alternatively, an ET dye set can be constructed using
cyanine (Cy2) as a donor and Cl2FAM, Cl2R6G, Cl2TAM, or
Cl2ROX as energy acceptors. Since Cy2 possesses higher
molar absorbance compared with the rhodamine and
fluorescein derivatives, an ET system using Cy2 as a
donor produces much stronger fluorescence signals than
the system using FAM as a donor (Hung et al. 1996).
Figure 16 shows a synthetic scheme for an ET dye labeled
nucleotide analogue with Cy2 as a donor and Cl2FAM as an
acceptor using similar coupling chemistry as for the
synthesis of an energy transfer system using FAM as a
donor (Lee et al. 1997). Coupling of Cl2FAM (I) with
spacer 4-aminomethylbenzoic acid (II) produces III,
which is then converted to NHS ester IV. Coupling of IV
with amino-Cy2, and then converting the resulting
compound to an NHS ester, produces V, which subsequently
couples with amino-photolinker nucleotide VI to yield the
ET dye labeled nucleotide VII.

6. Sequencing by synthesis evaluation using nucleotide
analogues 3'-HO-A-Tag1, 3'-HO-C-Tag2, 3'-HO-G-Tag3, 3'-HO-T-Tag4

The precursors of four examples of mass tags are shown
in Figure 17. The precursors are: (a) acetophenone; (b)
3-fluoroacetophenone; (c) 3,4-difluoroacetophenone; and
(d) 3,4-dimethoxyacetophenone. Upon nitration and
reduction, four photoactive tags are produced from the
four precursors and used to code for the identity of
each of the four nucleotides (A, C, G, T). Clean APCI
mass spectra are obtained for the four mass tag
precursors (a, b, c, d) as shown in Figure 18. The peak
with m/z of 121 is a, 139 is b, 157 is c, and 181 is d.
This result shows that these four mass tags are
extremely stable and produce very high resolution data
in an APCI mass spectrometer with no cross talk between
the mass tags. In the examples shown below, each of the
unique m/z from each mass tag translates to the identity
of the nucleotide [Tag-1 (m/z, 150) = A; Tag-2 (m/z, 168)
= C; Tag-3 (m/z, 186) = G; Tag-4 (m/z, 210) = T].
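
The m/z-to-nucleotide assignment above amounts to a lookup table. The Python
sketch below encodes it and also reports the smallest gap between neighboring
m/z values, which indicates how much separation a detector has to resolve. The
m/z values are those given in the text; the function names and the matching
tolerance are hypothetical choices made for this illustration.

```python
# m/z values quoted in the text.
TAG_MZ_TO_BASE = {150: "A", 168: "C", 186: "G", 210: "T"}
PRECURSOR_MZ = {"a": 121, "b": 139, "c": 157, "d": 181}


def decode_tag(observed_mz, tolerance=0.5):
    """Map an observed m/z to a nucleotide, or None if nothing matches.

    tolerance is a hypothetical matching window, not a value from
    the specification.
    """
    for mz, base in TAG_MZ_TO_BASE.items():
        if abs(observed_mz - mz) <= tolerance:
            return base
    return None


def min_spacing(mz_values):
    """Smallest gap between neighboring m/z values."""
    ordered = sorted(mz_values)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    print([decode_tag(mz) for mz in (150.1, 168.0, 186.2, 210.0)])
    print(min_spacing(TAG_MZ_TO_BASE))
    print(min_spacing(PRECURSOR_MZ.values()))
```

The closest pair in either set is 18 m/z units apart, so a match window well
under half of that keeps the four tags from being confused with one another.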

Different combinations of mass tags and nucleotides can
be used, as indicated by the general scheme: 3'-HO-A-Tag1,
3'-HO-C-Tag2, 3'-HO-G-Tag3, 3'-HO-T-Tag4, where Tag1, Tag2, Tag3,
and Tag4 are four different unique cleavable mass tags.
Four specific examples of nucleotide analogues are shown
in Figure 19. In Figure 19, "R" is H when the 3'-OH
group is not capped. As discussed above, the
photocleavable 2-nitrobenzyl moiety has been used to link
biotin to DNA and protein for efficient removal by UV
light (~350 nm) irradiation (Olejnik et al. 1995,
1999). Four different 2-nitrobenzyl groups with
different molecular weights are used as mass tags to
form the mass tag labeled nucleotides as shown in Figure
19: 2-nitro-α-methyl-benzyl (Tag-1) codes for A;
2-nitro-α-methyl-3-fluorobenzyl (Tag-2) codes for C;
2-nitro-α-methyl-3,4-difluorobenzyl (Tag-3) codes for G;
2-nitro-α-methyl-3,4-dimethoxybenzyl (Tag-4) codes for
T.

As a representative example, the synthesis of the NHS
ester of one mass tag (Tag-3) is shown in Figure 20. A
similar scheme is used to create the other mass tags.
The synthesis of 3'-HO-G-Tag3 is shown in Figure 21 using
well-established procedures (Prober et al. 1987; Lee et
al. 1992 and Hobbs et al. 1991). 7-propargylamino-dGTP
is first prepared by reacting 7-I-dGTP with
N-trifluoroacetylpropargyl amine, which is then coupled
with the NHS-Tag-3 to produce 3'-HO-G-Tag3. The
nucleotide analogues with a free 3'-OH are good
substrates for the polymerase.

The sequencing by synthesis approach can be tested using
mass tags using a scheme similar to that shown for dyes
in Figure 9. A DNA template containing a portion of
nucleotide sequence that has no repeated sequences after
the priming site is synthesized and immobilized to a
glass channel. 3'-HO-A-Tag1 and DNA polymerase are added to
the self-primed DNA moiety to allow the incorporation of
the nucleotide into the 3' site of the DNA. Then the
steps in Figure 2B are followed (the chemical cleavage
is not required here because the 3'-OH is free) to
detect the mass tag from Tag-1 (m/z = 150). Next,
3'-HO-C-Tag2 is added and the resulting mass spectrum is
measured after cleaving Tag-2 (m/z = 168). Next, 3'-HO-G-Tag3
and 3'-HO-T-Tag4 are added in turn and the mass spectra
of the cleavage products Tag-3 (m/z = 186) and Tag-4 (m/z
= 210) are measured. Examples of expected photocleavage
products are shown in Figure 22. The photocleavage
mechanism is as described above for the case where the
unique labels are dyes. Light absorption (300 - 360 nm)
by the aromatic 2-nitrobenzyl moiety causes reduction
of the 2-nitro group to a nitroso group and an oxygen
insertion into the carbon-hydrogen bond located in the
2-position followed by cleavage and decarboxylation
(Pillai 1980).

The synthesis of nucleotide analogues 3'-RO-A-Tag1,
3'-RO-C-Tag2, 3'-RO-G-Tag3, 3'-RO-T-Tag4 can be pursued for further
study of the system as discussed above for the case where
the unique labels are dyes. Here the 3'-OH is capped in
all four nucleotide analogues, which then can be mixed
together with DNA polymerase and used to evaluate the
sequencing system using a scheme similar to that in
Figure 9. The MOM (-CH2OCH3) or allyl (-CH2CH=CH2) group
is used to cap the 3'-OH group using well-established
synthetic procedures (Figure 13) (Fuji et al. 1975,
Metzker et al. 1994). These groups can be removed
chemically with high yield as shown in Figure 14
(Ireland et al. 1986; Kamal et al. 1999). The chemical
cleavage of the MOM and allyl groups is fairly mild and
specific, so as not to degrade the DNA template moiety.


7. Parallel Channel System for Sequencing by Synthesis

Figure 23 illustrates an example of a parallel channel
system. The system can be used with mass tag labels as
shown and also with dye labels. A plurality of channels
in a silica glass chip are connected on each end of the
channel to a well in a well plate. In the example shown
there are 96 channels, each connected to its own wells.
The sequencing system also permits a number of channels
other than 96 to be used. 96 channel devices for
separating DNA sequencing and sizing fragments have been
reported (Woolley and Mathies 1994, Woolley et al. 1997,
Simpson et al. 1998). The chip is made by
photolithographic masking and chemical etching
techniques. The photolithographically defined channel
patterns are etched in a silica glass substrate, and
then capillary channels (id ~100 μm) are formed by
thermally bonding the etched substrate to a second
silica glass slide. Channels are porous to increase
surface area. The immobilized single stranded DNA
template chip is prepared according to the scheme shown
in Figure 3. Each channel is first treated with 0.5 M
NaOH, washed with water, and is then coated with high
density 3-aminopropyltrimethoxysilane in aqueous ethanol
(Woolley et al. 1994), forming a primary amine surface.
The N-hydroxysuccinimidyl (NHS) ester of triarylphosphine
(1) is covalently coupled with the primary amine group,
converting the amine surface to a novel triarylphosphine
surface, which specifically reacts with DNA containing
an azido group (2), forming a chip with immobilized DNA.
Since the azido group is located only at the 5' end of
the DNA and the coupling reaction is through the unique
reaction of the triarylphosphine moiety with the azido
group in aqueous solution (Saxon and Bertozzi 2000), such
a DNA surface provides an optimized condition for
hybridization. Fluids, such as sequencing reagents and
washing solutions, can be easily pressure driven between
the two 96 well plates to wash and add reagents to each
channel in the chip for carrying out the polymerase
reaction as well as collecting the photocleaved labels.
The silica chip is transparent to ultraviolet light (λ ~
350 nm). In the Figure, photocleaved mass tags are
detected by an APCI mass spectrometer upon irradiation
with a UV light source.

8. Parallel Mass Tag Sequencing by Synthesis System 

The approach disclosed herein comprises detecting four
unique photoreleased mass tags, which can have molecular
weights from 150 to 250 daltons, to decode the DNA
sequence, thereby obviating the issue of detecting large
DNA fragments using a mass spectrometer as well as the
stringent sample requirement for using mass spectrometry
to directly detect long DNA fragments. It takes 10
seconds or less to analyze each mass tag using the APCI
mass spectrometer. With 8 miniaturized APCI mass
spectrometers in a system, close to 100,000 bp of high
quality digital DNA sequencing data could be generated
each day by each instrument using this approach. Since
there are no separation and purification requirements
using this approach, such a system is cost effective.
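
The throughput figure above can be checked with simple arithmetic. The Python
sketch below assumes that each mass tag reading yields one base, that all
spectrometers run continuously for 24 hours, and that there is no overhead for
reagent addition or washing; these are simplifying assumptions of this example
and the function names are hypothetical. Under these assumptions the result is
an upper bound.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tags_per_day(seconds_per_tag, n_spectrometers):
    """Upper bound on mass tags (bases) read per day by the whole system."""
    return int(SECONDS_PER_DAY // seconds_per_tag) * n_spectrometers


def seconds_per_tag_for(target_per_day, n_spectrometers):
    """Analysis time per tag needed to reach a daily target."""
    return SECONDS_PER_DAY * n_spectrometers / target_per_day


if __name__ == "__main__":
    print(tags_per_day(10, 8))
    print(seconds_per_tag_for(100_000, 8))
```

At exactly 10 seconds per tag, 8 spectrometers give 69,120 reads per day;
reaching 100,000 per day corresponds to about 6.9 seconds per tag, which is
consistent with the stated "10 seconds or less".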
To make mass spectrometry competitive with a 96
capillary array method for analyzing DNA, a parallel
mass spectrometer approach is needed. Such a complete
system has not been reported mainly due to the fact that
most of the mass spectrometers are designed to achieve
adequate resolution for large biomolecules. The system
disclosed herein requires the detection of four mass
tags, with molecular weight range between 150 and 250
daltons, coding for the identity of the four nucleotides
(A, C, G, T). Since a mass spectrometer dedicated to
detection of these mass tags only requires high
resolution for the mass range of 150 to 250 daltons
instead of covering a wide mass range, the mass
spectrometer can be miniaturized and have a simple
design. Either quadrupole (including ion trap detector)
or time-of-flight mass spectrometers can be selected for
the ion optics. While modern mass spectrometer
technology has made it possible to produce miniaturized
mass spectrometers, most current research has focused on
the design of a single stand-alone miniaturized mass
spectrometer. Individual components of the mass
spectrometer have been miniaturized for enhancing the
mass spectrometer analysis capability (Liu et al. 2000,
Zhang et al. 1999). A miniaturized mass spectrometry
system using multiple analyzers (up to 10) in parallel
has been reported (Badman and Cooks 2000). However,
the mass spectrometer of Badman and Cooks was designed to
measure only single samples rather than multiple samples
in parallel. They also noted that the miniaturization
of the ion trap limited the capability of the mass
spectrometer to scan wide mass ranges. Since the
approach disclosed herein focuses on detecting four
small stable mass tags (the mass range is less than 300
daltons), multiple miniaturized APCI mass spectrometers
are easily constructed and assembled into a single unit
for parallel analysis of the mass tags for DNA
sequencing analysis.

A complete parallel mass spectrometry system includes
multiple APCI sources interfaced with multiple
analyzers, coupled with appropriate electronics and
power supply configuration. A mass spectrometry system
with parallel detection capability will overcome the
throughput bottleneck issue for application in DNA
analysis. A parallel system containing multiple mass
spectrometers in a single device is illustrated in
Figures 23 and 24. The examples in the figures show a
system with three mass spectrometers in parallel. Higher
throughput is obtained using a greater number of mass
spectrometers in parallel.


As illustrated in Figure 24, the three miniature mass
spectrometers are contained in one device with two
turbo-pumps. Samples are injected into the ion source
where they are mixed with a nebulizer gas and ionized.
One turbo pump is used as a differential pumping system
to continuously sweep away free radicals, neutral
compounds and other undesirable elements coming from the
ion source at the orifice between the ion source and the
analyzer. The second turbo pump is used to generate a
continuous vacuum in all three analyzers and detectors
simultaneously. Since the corona discharge mode and
scanning mode of mass spectrometers are the same for
each miniaturized mass spectrometer, one power supply
for each analyzer and the ionization source can provide
the necessary power for all three instruments. One
power supply for each of the three independent detectors
is used for spectrum collection. The data obtained are
transferred to three independent A/D converters and
processed by the data system simultaneously to identify
the mass tag in the injected sample and thus identify
the nucleotide. Despite containing three mass
spectrometers, the entire device is able to fit on a
laboratory bench top.

9. Validate the Complete Sequencing by Synthesis System 
By Sequencing P53 Genes 

The tumor suppressor gene p53 can be used as a model
system to validate the DNA sequencing system. The p53
gene is one of the most frequently mutated genes in
human cancer (O'Connor et al. 1997). First, a base pair
DNA template (shown below) is synthesized containing an
azido group at the 5' end and a portion of the sequences
from exon 7 and exon 8 of the p53 gene:

5'-N3-TTCCTGCATGGGCGGCATGAACCCGAGGCCCATCCTCACCATCATCAC
ACTGGAAGACTCCAGTGGTAATCTACTGGGACGGAACAGCTTTGAGGTGCATT
-3' (SEQ ID NO: 2).

This template is chosen to explore the use of the
sequencing system for the detection of clustered hot
spot single base mutations. The potentially mutated
bases are underlined (A, G, C and T) in the synthetic
template. The synthetic template is immobilized on a
sequencing chip or glass channels, then the loop primer
is ligated to the immobilized template as described in
Figure 6, and then the steps in Figure 2 are followed
for sequencing evaluation. DNA templates generated by
PCR can be used to further validate the DNA sequencing
system. The sequencing templates can be generated by
PCR using flanking primers (one of the pair is labeled
with an azido group at the 5' end) in the intron region
located at each p53 exon boundary from a pool of genomic
DNA (Boehringer, Indianapolis, IN) as described by Fu et
al. (1998) and then immobilized on the DNA chip for
sequencing evaluation.
What is claimed is:

1. A method for sequencing a nucleic acid by detecting 
the identity of a nucleotide analogue after the 
nucleotide analogue is incorporated into a growing 
strand of DNA in a polymerase reaction, which 
comprises the following steps: 

(i) attaching a 5' end of the nucleic acid to a 
solid surface; 

(ii) attaching a primer to the nucleic acid 
attached to the solid surface; 

(iii) adding a polymerase and one or more different 
nucleotide analogues to the nucleic acid to 
thereby incorporate a nucleotide analogue into 
the growing strand of DNA, wherein the 
incorporated nucleotide analogue terminates 
the polymerase reaction and wherein each 
different nucleotide analogue comprises (a) a 
base selected from the group consisting of 
adenine, guanine, cytosine, thymine, and 
uracil, and their analogues; (b) a unique 
label attached through a cleavable linker to 
the base or to an analogue of the base; (c) a 
deoxyribose; and (d) a cleavable chemical 
group to cap an -OH group at a 3' -position of 
the deoxyribose; 



(iv) washing the solid surface to remove
unincorporated nucleotide analogues;

(v) detecting the unique label attached to the
nucleotide analogue that has been incorporated
into the growing strand of DNA, so as to
thereby identify the incorporated nucleotide
analogue;

(vi) adding one or more chemical compounds to
permanently cap any unreacted -OH group on the
primer attached to the nucleic acid or on a
primer extension strand formed by adding one
or more nucleotides or nucleotide analogues to
the primer;

(vii) cleaving the cleavable linker between the
nucleotide analogue that was incorporated into
the growing strand of DNA and the unique
label;

(viii) cleaving the cleavable chemical group
capping the -OH group at the 3'-position of
the deoxyribose to uncap the -OH group, and
washing the solid surface to remove cleaved
compounds; and

(ix) repeating steps (iii) through (viii) so as to
detect the identity of a newly incorporated
nucleotide analogue into the growing strand of
DNA;

wherein if the unique label is a dye, the order of
steps (v) through (vii) is: (v), (vi), and (vii);
and

wherein if the unique label is a mass tag, the
order of steps (v) through (vii) is: (vi), (vii),
and (v).

2. The method of claim 1, wherein the solid surface is 
glass, silicon, or gold. 

3. The method of claim 1, wherein the solid surface is 
a magnetic bead, a chip, a channel in a chip, or a 
porous channel in a chip. 

4. The method of claim 1, wherein the step of
attaching the nucleic acid to the solid surface
comprises:

(i) coating the solid surface with a phosphine
moiety,

(ii) attaching an azido group to the 5' end of the
nucleic acid, and

(iii) immobilizing the 5' end of the nucleic acid
to the solid surface through interaction
between the phosphine moiety on the solid
surface and the azido group on the 5' end of
the nucleic acid.
5. The method of claim 4, wherein the step of coating
the solid surface with the phosphine moiety
comprises:

(i) coating the surface with a primary amine, and

(ii) covalently coupling a N-hydroxysuccinimidyl
ester of triarylphosphine with the primary
amine.

6. The method of claim 1, wherein the nucleic acid 
that is attached to the solid surface is a single- 
stranded DNA. 



7. The method of claim 1, wherein the nucleic acid
that is attached to the solid surface in step (i)
is a double-stranded DNA, wherein only one strand
is directly attached to the solid surface, and
wherein the strand that is not directly attached to
the solid surface is removed by denaturing before
proceeding to step (ii).

8. The method of claim 1, wherein the nucleic acid
that is attached to the solid surface is a RNA, and
the polymerase in step (iii) is reverse
transcriptase.

9. The method of claim 1, wherein the primer is
attached to a 3' end of the nucleic acid in step
(ii) and wherein the attached primer comprises a
stable loop and an -OH group at a 3'-position of a
deoxyribose capable of self-priming in the
polymerase reaction.



10. The method of claim 1, wherein the step of 
5 attaching the primer to the nucleic acid comprises 

hybridizing the primer to the nucleic acid or 
ligating the primer to the nucleic acid. 

11. The method of claim 1, wherein one or more of four 
10 different nucleotide analogues is added in step 

(iii) , wherein each different nucleotide analogue 
comprises a different base selected from the group 
consisting of thymine or uracil or an analogue of 
thymine or uracil, adenine or an analogue of 
15 adenine, cytosine or an analogue of cytosine, and 

guanine or an analogue of guanine, and wherein each 
of the four different nucleotide analogues 
comprises a unique label. 

12. The method of claim 1, wherein the cleavable
chemical group that caps the -OH group at the 3'-
position of the deoxyribose in the nucleotide
analogue is -CH2OCH3 or -CH2CH=CH2.



13. The method of claim 1, wherein the unique label
that is attached to the nucleotide analogue is a
fluorescent moiety or a fluorescent semiconductor
crystal.



14. The method of claim 13, wherein the fluorescent
moiety is selected from the group consisting of 5-
carboxyfluorescein, 6-carboxyrhodamine-6G,
N,N,N',N'-tetramethyl-6-carboxyrhodamine, and
carboxy-X-rhodamine.



15. The method of claim 1, wherein the unique label
that is attached to the nucleotide analogue is a
fluorescence energy transfer tag which comprises an
energy transfer donor and an energy transfer
acceptor.

16. The method of claim 15, wherein the energy transfer
donor is 5-carboxyfluorescein or cyanine, and
wherein the energy transfer acceptor is selected
from the group consisting of
dichlorocarboxyfluorescein, dichloro-6-
carboxyrhodamine-6G, dichloro-N,N,N',N'-
tetramethyl-6-carboxyrhodamine, and dichloro-6-
carboxy-X-rhodamine.

17. The method of claim 1, wherein the unique label
that is attached to the nucleotide analogue is a
mass tag that can be detected and differentiated by
a mass spectrometer.

18. The method of claim 17, wherein the mass tag is
selected from the group consisting of a 2-nitro-α-
methyl-benzyl group, a 2-nitro-α-methyl-3-
fluorobenzyl group, a 2-nitro-α-methyl-3,4-
difluorobenzyl group, and a 2-nitro-α-methyl-3,4-
dimethoxybenzyl group.



19. The method of claim 1, wherein the unique label is
attached through a cleavable linker to a 5-position
of cytosine or thymine or to a 7-position of deaza-
adenine or deaza-guanine.



20. The method of claim 1, wherein the cleavable linker
between the unique label and the nucleotide
analogue is cleaved by a means selected from the
group consisting of one or more of a physical
means, a chemical means, a physical chemical means,
heat, and light.

21. The method of claim 20, wherein the cleavable 
linker is a photocleavable linker which comprises a 
2-nitrobenzyl moiety. 

22. The method of claim 1, wherein the cleavable 
chemical group used to cap the -OH group at the 3'- 
position of the deoxyribose is cleaved by a means 
selected from the group consisting of one or more 
of a physical means, a chemical means, a physical 
chemical means, heat, and light. 

23. The method of claim 1, wherein the chemical 
compounds added in step (vi) to permanently cap any 
unreacted -OH group on the primer attached to the 
nucleic acid or on the primer extension strand are 
a polymerase and one or more different 
dideoxynucleotides or analogues of 
dideoxynucleotides.

24. The method of claim 23, wherein the different
dideoxynucleotides are selected from the group
consisting of 2',3'-dideoxyadenosine 5'-
triphosphate, 2',3'-dideoxyguanosine 5'-
triphosphate, 2',3'-dideoxycytidine 5'-
triphosphate, 2',3'-dideoxythymidine 5'-
triphosphate, 2',3'-dideoxyuridine 5'-triphosphate,
and their analogues.

25. The method of claim 1, wherein a polymerase and one
or more of four different dideoxynucleotides are
added in step (vi), and wherein each different
dideoxynucleotide is selected from the group
consisting of 2',3'-dideoxyadenosine 5'-
triphosphate or an analogue of 2',3'-
dideoxyadenosine 5'-triphosphate; 2',3'-
dideoxyguanosine 5'-triphosphate or an analogue of
2',3'-dideoxyguanosine 5'-triphosphate; 2',3'-
dideoxycytidine 5'-triphosphate or an analogue of
2',3'-dideoxycytidine 5'-triphosphate; and 2',3'-
dideoxythymidine 5'-triphosphate or 2',3'-
dideoxyuridine 5'-triphosphate or an analogue of
2',3'-dideoxythymidine 5'-triphosphate or an
analogue of 2',3'-dideoxyuridine 5'-triphosphate.

26. The method of claim 17, wherein the mass tag is
detected using a parallel mass spectrometry system
which comprises a plurality of atmospheric pressure
chemical ionization mass spectrometers for parallel
analysis of a plurality of samples comprising mass
tags.



27. A method of simultaneously sequencing a plurality
of different nucleic acids, which comprises
simultaneously applying the method of claim 1 to
the plurality of different nucleic acids.

28. Use of the method of claim 1 or 27 for detection of 
single nucleotide polymorphisms, genetic mutation 
analysis, serial analysis of gene expression, gene 
expression analysis, identification in forensics, 
genetic disease association studies, DNA 
sequencing, genomic sequencing, translational 
analysis, or transcriptional analysis. 

29. A method of attaching a nucleic acid to a solid
surface which comprises:

(i) coating the solid surface with a phosphine
moiety,

(ii) attaching an azido group to a 5' end of the
nucleic acid, and

(iii) immobilizing the 5' end of the nucleic acid
to the solid surface through interaction
between the phosphine moiety on the solid
surface and the azido group on the 5' end of
the nucleic acid.

30. The method of claim 29, wherein the step of coating
the solid surface with the phosphine moiety
comprises:

(i) coating the surface with a primary amine,
and

(ii) covalently coupling a N-hydroxysuccinimidyl
ester of triarylphosphine with the primary
amine.






31. The method of claim 29, wherein the solid surface 
is glass, silicon, or gold. 

32. The method of claim 29, wherein the solid surface 
is a magnetic bead, a chip, a channel in a chip, or 
a porous channel in a chip. 

33. The method of claim 29, wherein the nucleic acid 
that is attached to the solid surface is a single- 
stranded DNA, a double-stranded DNA or a RNA. 

34. The method of claim 33, wherein the nucleic acid is 
a double-stranded DNA and only one strand is 
attached to the solid surface. 

35. The method of claim 34, wherein the strand of the 
double-stranded DNA that is not attached to the 
solid surface is removed by denaturing. 

36. Use of the method of claim 29 for gene expression 
analysis, microarray based gene expression 
analysis, mutation detection, translational 
analysis, or transcriptional analysis. 

37. A nucleotide analogue which comprises: 



WO 02/29003 PCT/US01/31243

(a) a base selected from the group consisting of 
adenine or an analogue of adenine, cytosine or 
an analogue of cytosine, guanine or an 
analogue of guanine, thymine or an analogue of 
thymine, and uracil or an analogue of uracil; 

(b) a unique label attached through a cleavable 
linker to the base or to an analogue of the 
base; 

(c) a deoxyribose; and 

(d) a cleavable chemical group to cap an -OH group
at a 3'-position of the deoxyribose.



38. The nucleotide analogue of claim 37, wherein the
cleavable chemical group that caps the -OH group at
the 3'-position of the deoxyribose is -CH2OCH3 or
-CH2CH=CH2.

39. The nucleotide analogue of claim 37, wherein the 
unique label is a fluorescent moiety or a 
fluorescent semiconductor crystal. 



40. The nucleotide analogue of claim 39, wherein the
fluorescent moiety is selected from the group
consisting of 5-carboxyfluorescein, 6-
carboxyrhodamine-6G, N,N,N',N'-tetramethyl-6-
carboxyrhodamine, and 6-carboxy-X-rhodamine.

41. The nucleotide analogue of claim 37, wherein the
unique label is a fluorescence energy transfer tag
which comprises an energy transfer donor and an
energy transfer acceptor.



42. The nucleotide analogue of claim 41, wherein the
energy transfer donor is 5-carboxyfluorescein or
cyanine, and wherein the energy transfer acceptor
is selected from the group consisting of
dichlorocarboxyfluorescein, dichloro-6-
carboxyrhodamine-6G, dichloro-N,N,N',N'-
tetramethyl-6-carboxyrhodamine, and dichloro-6-
carboxy-X-rhodamine.



43. The nucleotide analogue of claim 37, wherein the
unique label is a mass tag that can be detected and
differentiated by a mass spectrometer.



44. The nucleotide analogue of claim 43, wherein the
mass tag is selected from the group consisting of a
2-nitro-α-methyl-benzyl group, a 2-nitro-α-methyl-
3-fluorobenzyl group, a 2-nitro-α-methyl-3,4-
difluorobenzyl group, and a 2-nitro-α-methyl-3,4-
dimethoxybenzyl group.

45. The nucleotide analogue of claim 37, wherein the
unique label is attached through a cleavable linker
to a 5-position of cytosine or thymine or to a 7-
position of deaza-adenine or deaza-guanine.

46. The nucleotide analogue of claim 37, wherein the
linker between the unique label and the nucleotide
analogue is cleavable by a means selected from the
group consisting of one or more of a physical
means, a chemical means, a physical chemical means,
heat, and light.

47. The nucleotide analogue of claim 46, wherein the
cleavable linker is a photocleavable linker which
comprises a 2-nitrobenzyl moiety.

48. The nucleotide analogue of claim 37, wherein the
cleavable chemical group used to cap the -OH group
at the 3'-position of the deoxyribose is cleavable
by a means selected from the group consisting of
one or more of a physical means, a chemical means,
a physical chemical means, heat, and light.

49. The nucleotide analogue of claim 37, wherein the
nucleotide analogue is selected from the group
consisting of:

[Chemical structures omitted]

wherein Dye1, Dye2, Dye3, and Dye4 are four different
dye labels; and

wherein R is a cleavable chemical group used to cap
the -OH group at the 3'-position of the
deoxyribose.

50. The nucleotide analogue of claim 49, wherein the
nucleotide analogue is selected from the group
consisting of:

[Chemical structures omitted]

wherein R is -CH2OCH3 or -CH2CH=CH2.

51. The nucleotide analogue of claim 37, wherein the
nucleotide analogue is selected from the group
consisting of:

[Chemical structures omitted]

wherein Tag1, Tag2, Tag3, and Tag4 are four different
mass tag labels; and

wherein R is a cleavable chemical group used to cap
the -OH group at the 3'-position of the
deoxyribose.

52. The nucleotide analogue of claim 51, wherein the
nucleotide analogue is selected from the group
consisting of:

[Chemical structures omitted]

wherein R is -CH2OCH3 or -CH2CH=CH2.

53. Use of the nucleotide analogue of claim 37 for
detection of single nucleotide polymorphisms,
genetic mutation analysis, serial analysis of gene
expression, gene expression analysis,
identification in forensics, genetic disease
association studies, DNA sequencing, genomic
sequencing, translational analysis, or
transcriptional analysis.

54. A parallel mass spectrometry system, which
comprises a plurality of atmospheric pressure
chemical ionization mass spectrometers for parallel
analysis of a plurality of samples comprising mass
tags.

55. The system of claim 54, wherein the mass
spectrometers are quadrupole mass spectrometers or
time-of-flight mass spectrometers.

56. The system of claim 54, wherein the mass
spectrometers are contained in one device.

57. The system of claim 54 which further comprises two
turbo-pumps, wherein one pump is used to generate a
vacuum and a second pump is used to remove
undesired elements.



58. The system of claim 54, which comprises at least
three mass spectrometers.

59. The system of claim 54, wherein the mass tags have
molecular weights between 150 daltons and 250
daltons.



60. Use of the system of claim 54 for DNA sequencing
analysis, detection of single nucleotide
polymorphisms, genetic mutation analysis, serial
analysis of gene expression, gene expression
analysis, identification in forensics, genetic
disease association studies, DNA sequencing,
genomic sequencing, translational analysis, or
transcriptional analysis.

Photocleavable Linker

R = H, CH2OCH3 (MOM) or CH2-CH=CH2 (Allyl)

FIGURE 16

FIGURE 19

SEQUENCE LISTING

<110> The Trustees Of Columbia University In The City Of

<120> Massive Parallel Method For Decoding DNA And RNA

<130> 0575/62239-B-PCT/JPW

<140> Not Yet Known
<141> 2001-10-05

<150> 09/684,670
<151> 2000-10-06

<160> 2

<170> PatentIn Ver. 2.1

<210> 1 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: template 
<400> 1 

acgtacgacg t 11 



<210> 2 
<211> 101 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: template 
<400> 2 

ttcctgcatg ggcggcatga acccgaggcc catcctcacc atcatcacac tggaagactc 60 
cagtggtaat ctactgggac ggaacagctt tgaggtgcat t 101 
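
The listing above declares a length (<211>) for each sequence alongside its residues (<400>). As an illustrative consistency check that is not part of the original filing, the minimal Python sketch below strips the layout spacing and running position counts and compares the residue count with the declared length. The names `LISTING`, `normalize`, and `check_listing` are hypothetical and introduced only for this example.

```python
import re

# Declared length (<211>) and raw residue text (<400>) for each sequence,
# copied from the listing above. Running position counts (60, 101, 11)
# are left in on purpose; normalize() removes them.
LISTING = {
    1: (11, "acgtacgacg t 11"),
    2: (101,
        "ttcctgcatg ggcggcatga acccgaggcc catcctcacc atcatcacac tggaagactc 60 "
        "cagtggtaat ctactgggac ggaacagctt tgaggtgcat t 101"),
}


def normalize(residues: str) -> str:
    """Remove whitespace and digits used for layout; lowercase the rest."""
    return re.sub(r"[\s\d]", "", residues).lower()


def check_listing(listing: dict) -> dict:
    """Return {sequence id: True/False}.

    A sequence passes when its residue count equals the declared length
    and it contains only the DNA letters a, c, g, t.
    """
    report = {}
    for seq_id, (declared_length, residues) in listing.items():
        seq = normalize(residues)
        report[seq_id] = (len(seq) == declared_length
                          and set(seq) <= set("acgt"))
    return report


if __name__ == "__main__":
    print(check_listing(LISTING))
```

Both entries pass this check, which suggests the residues survived text extraction intact; the check says nothing about whether the sequences match the original filing.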






